DISTRIBUTION-FREE TESTS OF FIT FOR CONTINUOUS
DISTRIBUTION FUNCTIONS


By Z. W. Birnbaum
University of Washington 


Summary. A class of statistics, large enough to comprise those used in all 
the known distribution-free tests of fit for continuous distribution functions, is 
characterized by a structure called “structure (d).” A number of statistics of
this class may be constructed and used for tests of fit. To make a reasonable
choice among all these statistics, it appears desirable to introduce in the space
of continuous distribution functions a distance which would reflect the type of 
discrepancy the proposed test is intended to detect. By studying the power of 
various statistics with regard to this distance one may then be able to choose 
those with optimal properties. 


1. Introduction. The main object of this paper is to discuss techniques for
deciding whether a sample $X_1, X_2, \dots, X_n$ of a one-dimensional random
variable $X$ was obtained from a population which has a completely specified
continuous cumulative distribution function $H(x)$. More specifically, we shall
limit ourselves to the following problem.

Given a continuous cumulative probability function $H(x) = \mathrm{Prob}\{X \le x\}$
and a class $\mathcal{A}$ of continuous cumulative probability functions different from
$H(x)$; required is a procedure which, for every sample $X_1, \dots, X_n$, will enable
us to decide whether to accept the “hypothesis” $H(x)$ or the “alternative” $\mathcal{A}$.
This procedure should be distribution-free for every sample size $n$, in a sense
which will be described in Section 3. Procedures of this kind are referred to as
“distribution-free tests of fit.” In the course of our presentation it will appear
necessary to formulate the concept of distribution-free statistics and to discuss
some of its properties.


2. Some known tests. The following is a concise description of some distribu- 
tion-free tests of fit and the statistics on which they are based. The tests men- 
tioned here are chosen mainly for illustrative purposes and no attempt at com- 
pleteness has been made. 

2.1. The chi square test. This test compares the empirical histogram determined
by a sample with the “expected” histogram determined by the hypothesis
$H(x)$, by means of the $\chi^2$ statistic. The limiting distribution of this statistic is
known and extensively tabulated, but little is known about the manner in which
its exact distribution for finite sample size approaches the limiting distribution
when the hypothesis is true, and even less when the alternative is true. The chi
square test is not distribution-free in the strict sense, since for finite sample
size $\chi^2$ is not a distribution-free statistic. It is mentioned here mainly for
historical reasons, as the procedure to which the term “test of goodness of fit”
was originally applied.

2.2. Smirnov's $\omega^2$-test. Let $F_n(x)$ be the “empirical distribution function”
defined by

$F_n(x) = k/n$ if $k$ sample values are $\le x$, $k = 0, 1, \dots, n$.

Modifying statistics proposed earlier by Cramér and v. Mises, Smirnov [2]
compares $F_n(x)$ with the hypothesis $H(x)$ by means of the statistic

(2.21) $\omega^2 = n \int_{-\infty}^{+\infty} [F_n(x) - H(x)]^2 \, dH(x)$

which, after some algebra, may also be written

(2.22) $\omega^2 = \frac{1}{12n} + \sum_{i=1}^{n} \left[ H(X_i) - \frac{2i - 1}{2n} \right]^2$,

where $X_1 \le X_2 \le \dots \le X_n$ denotes the ordered sample.
Smirnov showed that, if $H(x)$ is true, the probability distribution of $\omega^2$ is
independent of $H(x)$ for any $n$, and he obtained an asymptotic expression for this
probability distribution for $n \to \infty$.

2.3. The $W_n^2$-tests. In [3] T. W. Anderson and D. A. Darling consider a
generalization of $\omega^2$, given by

(2.31) $W_n^2 = n \int_{-\infty}^{+\infty} [F_n(x) - H(x)]^2 \, \psi[H(x)] \, dH(x)$,

where $\psi(t) \ge 0$, for $0 \le t \le 1$, is a given weight function. This statistic can be
rewritten

(2.32) $W_n^2 = 2 \sum_{j=1}^{n} \left\{ \varphi_1[H(X_j)] - \frac{2j - 1}{2n} \, \varphi_2[H(X_j)] \right\} + n \int_0^1 (1 - t)^2 \psi(t) \, dt$,

where $\varphi_1(t)$, $\varphi_2(t)$ are functions determined by $\psi(t)$ only. It is easily shown that
for any $\psi(t)$ and $n$ the probability distribution of $W_n^2$, if $H(x)$ is true, is
independent of $H(x)$. For $\psi(t) = 1$, (2.31) reduces to (2.21) and thus Smirnov's
$\omega^2$ is obtained as a special case. Anderson and Darling present a general method
for obtaining the asymptotic distribution of $W_n^2$ for $n \to \infty$. For $\omega^2$ their method
yields an expression different from that given by Smirnov, better adapted for
computation. A table obtained by using this expression is published in [3].

2.4. The $W_n^2$-test with $\psi(t) = 1/[t(1 - t)]$. Since for $x$ small the empirical
distribution function $F_n(x)$ and the hypothetical $H(x)$ are both close to 0, and for
$x$ large both are close to 1, the $\omega^2$ test is likely not to detect a discrepancy in the
tails of the distribution. For cases where such discrepancies are of importance,
Anderson and Darling [3] consider the weight function $\psi(t) = 1/[t(1 - t)]$. They
derive an asymptotic expression for the distribution of the resulting $W_n^2$
statistic, but a tabulation of this distribution is not available.

2.5. Kolmogorov's test. Kolmogorov [4] introduced the statistic

(2.51) $D_n = \sup_{-\infty < x < +\infty} |F_n(x) - H(x)|$,

showed that its probability distribution, if $H(x)$ is true, is independent of $H(x)$,
and derived the asymptotic probability distribution of $D_n$ for $n \to \infty$. A
tabulation of this asymptotic distribution was given by Smirnov [5]. In [4] Kolmogorov
also obtained recursion formulae which make it possible to compute the
probability distribution of $D_n$ for finite $n$. This distribution has been tabulated by
Massey [6], [7], and Birnbaum [8]. Wald and Wolfowitz [9] considered a more general
class of distribution-free statistics and showed how their probability
distributions can be computed for finite $n$.

2.6. The $K_n$-tests. In generalization of Kolmogorov's $D_n$, Anderson and
Darling [3] consider the statistic

(2.61) $K_n = \sup_{-\infty < x < +\infty} \sqrt{n} \, |F_n(x) - H(x)| \, \sqrt{\psi[H(x)]}$,

where $\psi(t) \ge 0$ is a given weight function. This statistic may also be written in
the form

(2.62) $K_n = \frac{1}{\sqrt{n}} \max_{j = 1, \dots, n} \left\{ \sqrt{\psi[H(X_j)]} \, \max\left[ nH(X_j) - (j - 1), \; j - nH(X_j) \right] \right\}$.

The probability distribution of $K_n$, if $H(x)$ is true, does not depend on $H(x)$,
and a method for obtaining the asymptotic distribution of $K_n$ for $n \to \infty$ is
developed in [3]. Kolmogorov's $D_n$ is equivalent to the special case obtained
from (2.61) by setting $\psi(t) = 1$. Anderson and Darling consider also the
important special case $\psi(t) = 1/[t(1 - t)]$ which yields a statistic suitable for
detecting discrepancies in the tails of the distribution. They have not succeeded,
however, in obtaining an expression for the asymptotic distribution which would
lend itself to practical use.

2.7. Tests related to spacing of sample values. If $X_1 \le X_2 \le \dots \le X_n$ is an
ordered sample of a random variable with probability distribution $H(x)$, then
the expectation of $H(X_{i+1}) - H(X_i)$ is $1/(n + 1)$ for $i = 0, 1, \dots, n$, with the
notations $H(X_0) = 0$, $H(X_{n+1}) = 1$. Any statistic which evaluates the
discrepancy between this expected and the actual spacing of the values $H(X_i)$ may
be used to test the hypothesis $H(x)$. Thus Kimball [10] considers the statistic

(2.71) $\sum_{i=1}^{n+1} \left[ H(X_i) - H(X_{i-1}) - \frac{1}{n + 1} \right]^2$

without deriving its asymptotic distribution. Moran [11] uses the statistic

(2.72) $\sum_{i=1}^{n+1} \left[ H(X_i) - H(X_{i-1}) \right]^2$

which differs from (2.71) by a constant, and shows that it is asymptotically
normally distributed. Sherman [12] recently proposed the statistic

(2.73) $\frac{1}{2} \sum_{i=1}^{n+1} \left| H(X_i) - H(X_{i-1}) - \frac{1}{n + 1} \right|$,

derived expressions for its distribution for finite $n$, and showed that it is
asymptotically normal.


2.8. One-sided tests of fit. Wald and Wolfowitz [9] studied the statistic

(2.81) $D_n^+ = \sup_{-\infty < x < +\infty} [F_n(x) - H(x)] = \max_{j = 1, \dots, n} \left[ \frac{j}{n} - H(X_j) \right]$

which, if $H(x)$ is true, has a probability distribution independent of $H(x)$, and
obtained expressions for this distribution for finite $n$. Birnbaum and Tingey
[13] obtained an alternative expression for the distribution of $D_n^+$ for finite $n$
and tabulated it. Smirnov [14] obtained the asymptotic distribution in the form of
an elementary function.
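
The statistics of Sections 2.2, 2.5 and 2.8 depend on the sample only through the ordered values $H(X_1) \le \dots \le H(X_n)$, which makes them easy to compute together. The following is a minimal sketch in Python, not part of the original paper; the function name `fit_statistics` is hypothetical, and (2.22) is taken in the n-scaled form $1/(12n) + \sum_i [H(X_i) - (2i-1)/(2n)]^2$.

```python
def fit_statistics(sample, H):
    """Return (omega2, d_n, d_plus) for a sample and a hypothesized continuous CDF H."""
    u = sorted(H(x) for x in sample)  # H(X_1) <= ... <= H(X_n); H is monotone
    n = len(u)
    # (2.22), n-scaled form: 1/(12n) + sum_i [H(X_i) - (2i - 1)/(2n)]^2
    omega2 = 1.0 / (12 * n) + sum(
        (u[i] - (2 * i + 1) / (2 * n)) ** 2 for i in range(n)  # i is 0-based here
    )
    # (2.81): D_n^+ = max_j [j/n - H(X_j)]
    d_plus = max((i + 1) / n - u[i] for i in range(n))
    # Mirror-image one-sided deviation: max_j [H(X_j) - (j - 1)/n]
    d_minus = max(u[i] - i / n for i in range(n))
    # (2.51): the two-sided supremum is the larger of the two one-sided deviations
    d_n = max(d_plus, d_minus)
    return omega2, d_n, d_plus


if __name__ == "__main__":
    # Illustrative hypothesis: the uniform distribution on [0, 1].
    uniform_cdf = lambda x: min(max(x, 0.0), 1.0)
    print(fit_statistics([0.7, 0.1, 0.4], uniform_cdf))
```

Because only the values $H(X_i)$ enter, the same routine serves for any continuous hypothesis, which is the computational face of the distribution-free property discussed in Section 3.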


3. On the concept of a distribution-free statistic. 

3.1. Statistics of structure (d). The tests described in the preceding sections are
all based on statistics which can be written in the form $\Phi[H(X_1), \dots, H(X_n)]$,
where $\Phi(U_1, \dots, U_n)$ is a measurable symmetric function of $U_1, \dots, U_n$.
We will refer to such statistics as statistics of structure (d) and investigate how
this particular structure of a statistic is related to its distribution-free character.

3.2. Distribution-free and strongly distribution-free statistics. Let $\Omega$ be a family
of cumulative probability functions. We shall say that $S[X_1, \dots, X_n, G]$ is a
distribution-free statistic in $\Omega$ if

(i) for every $G \in \Omega$ it is a measurable function of $X_1, \dots, X_n$, and

(ii) whenever $X_1, \dots, X_n$ is a sample of a random variable $X$ with the
cumulative probability function $G$, the probability distribution of $S[X_1, \dots, X_n, G]$
is independent of $G$, that is,

$\mathrm{Prob}\{S[X_1, \dots, X_n, G] \le s; \, G\} = \varphi(s)$

where $\varphi(s)$ depends only on $s$.

It is easily verified that if a statistic has structure (d) then it is
distribution-free in the family $\Omega$ of all continuous cumulative distribution functions. It
can, however, be shown by counter-example [15] that not every
distribution-free symmetric statistic in $\Omega$ is of structure (d).

For every continuous cumulative probability function $G(x)$ an inverse
function $G^{-1}(u)$ may be uniquely defined in $0 \le u \le 1$ by setting

$G^{-1}(0) = -\infty$,
$G^{-1}(u)$ = greatest lower bound of $x$ such that $G(x) = u$, for $0 < u \le 1$.

If $G(x) < 1$ for all real $x$, this definition shall mean $G^{-1}(1) = +\infty$.

Let $F(x)$ and $G(x)$ be two continuous cumulative probability functions. The
function $\tau(u) = F G^{-1}(u)$, $0 \le u \le 1$, constitutes a monotone mapping of the
unit-interval into itself. If that mapping is the identity, i.e. if $\tau(u) = u$, then
and only then $F(x) = G(x)$, and therefore the function $\tau(u) - u$ may be
interpreted as a very detailed description of the discrepancy between $G(x)$ and $F(x)$.

Let $\Omega$ and $\Omega'$ be two families of continuous cumulative probability functions.
We shall say that $S[X_1, \dots, X_n, G]$ is strongly distribution-free in $\Omega$ with respect
to $\Omega'$ if

(i) for every $G \in \Omega$ it is a measurable function of $X_1, \dots, X_n$, and

(ii) the probability distribution of $S[X_1, \dots, X_n, G]$, where $X_1, \dots, X_n$
are a sample of a random variable $X$ with a cumulative probability function
$F \in \Omega'$ and $G$ is any element of $\Omega$, depends only on $F G^{-1}$, that is,

$\mathrm{Prob}\{S[X_1, \dots, X_n, G] \le s; \, F\} = \psi(s, F G^{-1})$.

It is obvious that if $S[X_1, \dots, X_n, G]$ is strongly distribution-free in $\Omega$
with respect to $\Omega'$ and if $\Omega \subset \Omega'$, then $S[X_1, \dots, X_n, G]$ is distribution-free
in $\Omega$.

The following theorem can be proven [15]. If $\Omega$ is the class of all strictly
monotone continuous cumulative probability functions, $\Omega' = \Omega$, and
$S[X_1, \dots, X_n, G]$ is symmetric in $X_1, \dots, X_n$ and strongly distribution-free,
then it has structure (d).


4. On choosing a distribution-free test of fit. 

4.1. Need and criteria for making a choice. A number of distribution-free tests 
of fit have been described in Section 2, a number of other such tests have been 
proposed in literature, and still more can be constructed by selecting additional 
statistics of structure (d) and adapting them for the use in tests of fit. For any 
such statistic $S[X_1, \dots, X_n, H]$ the probability distribution, if $H(x)$ is true,
does not depend on H(x). It may therefore be assumed that H(x) is the uni- 
form probability distribution, and under this assumption it is usually possible 
to write the cumulative probability function of the statistic in form of a mul- 
tiple integral of a constant integrand over a polyhedral region. This integral 
can sometimes be evaluated explicitly, sometimes it can be reduced to a system 
of recursion formulae, or it may be possible to tabulate it numerically with the 
aid of modern computing equipment. 

The statistician is, therefore, faced with the problem of deciding in concrete 
situations on using one of the already known tests for which the necessary 
tables are available; or, on a more theoretical level, he may have to decide which 
of the many more possible tests deserve to be studied and developed. 

Besides obvious reasons of expediency, such as availability of tables, ease of 
computation, simplicity in use by untrained personnel, the statistician will have 
to consider various properties which make some of the tests theoretically more 
or less advantageous. For example, having to choose between the chi square and 
the Kolmogorov test, he will have to consider that the former can be adjusted 
to parametric families where the parameters are to be estimated from the
sample, while the latter has the advantage that it requires no arbitrary grouping of
observations and that the exact probability distribution for finite sample sizes
has been extensively tabulated.

It would appear, however, that the most essential preliminary consideration 
should be to determine what kind of discrepancy between hypothesis and al- 
ternative is materially important in a concrete situation. Then one may attempt 
to select a test best capable of detecting this kind of discrepancy. For example 
the chi square test is sensitive for discrepancies in the histograms, while the 
Kolmogorov test appears more likely to detect vertical discrepancies between 
the cumulative probability functions. 

In order to judge how good a distribution-free test of fit is for a definite prob- 
lem, one has therefore first to decide on a way to measure discrepancies between 
distributions by introducing in the appropriate space of probability distribu- 
tions a distance which may either satisfy the axioms of a Hausdorff metric or 
some other set of postulates. Once it is defined, it may be possible to study the 
power of various tests with regard to this distance and to select the test which is 
optimal with regard to some of the well known properties based on the power. 

4.2. Distances based on $\tau(u) = F G^{-1}(u)$. Since practically all
distribution-free statistics used for tests of fit are strongly distribution-free, it seems desirable
to use a metric which ascribes the same distance $\delta(F, G)$ to all pairs $F$, $G$ for which
$\tau(u) = F G^{-1}(u)$ is the same. Examples of such distances are

(4.21) $\sqrt{\int_0^1 [\tau(u) - u]^2 \, du} = \sqrt{\int_{-\infty}^{+\infty} [F(x) - G(x)]^2 \, dG(x)}$,

(4.22) $\int_0^1 |\tau(u) - u| \, du = \int_{-\infty}^{+\infty} |F(x) - G(x)| \, dG(x)$,

(4.23) $\sup_{0 < u < 1} |\tau(u) - u| = \sup_{-\infty < x < +\infty} |F(x) - G(x)|$,

(4.24) $\sup_{0 < u < 1} [\tau(u) - u] = \sup_{-\infty < x < +\infty} [F(x) - G(x)]$.


While (4.23) defines a Hausdorff metric, the other three expressions define 
directed distances FG. 
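
Since all four distances depend on $F$ and $G$ only through $\tau(u)$, they can be approximated from $\tau$ alone. The following Python sketch is an illustration added here, not the paper's procedure: the function name `tau_distances` and the midpoint grid are assumptions, and (4.21) is read as the square root of the integrated squared difference.

```python
import math


def tau_distances(tau, grid=10_000):
    """Approximate (4.21)-(4.24) for tau(u) = F(G^{-1}(u)) on a midpoint grid of (0, 1)."""
    us = [(k + 0.5) / grid for k in range(grid)]
    diffs = [tau(u) - u for u in us]
    d_421 = math.sqrt(sum(d * d for d in diffs) / grid)  # root of integral of [tau(u) - u]^2
    d_422 = sum(abs(d) for d in diffs) / grid            # integral of |tau(u) - u|
    d_423 = max(abs(d) for d in diffs)                   # sup |tau(u) - u|
    d_424 = max(diffs)                                   # directed sup [tau(u) - u]
    return d_421, d_422, d_423, d_424


if __name__ == "__main__":
    # Illustrative pair: G uniform on [0, 1] and F(x) = x^2 there, so tau(u) = u^2.
    print(tau_distances(lambda u: u * u))
```

In this illustrative pair $F$ lies below $G$ everywhere, so the directed distance (4.24) is essentially zero while (4.23) is not, which shows why the one-sided expressions are not symmetric in $F$ and $G$.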

On intuitive grounds one would be inclined to use Smirnov's $\omega^2$ statistic (2.21),
(2.22) if discrepancies described by metric (4.21) are considered important,
Kolmogorov's statistic (2.51) for the metric (4.23), the statistic $D_n^+$ in (2.81)
if (4.24) is the discrepancy that matters, etc. A systematic treatment would
possibly require the introduction of a distance which depends on $n$.

Very few attempts have been made at studies in this direction. Mann and 
Wald [16] investigated the power of the chi square test with regard to the dis- 
tance (4.23). An elaboration of their results, together with useful numerical 
tabulations, was made by Williams [17]. A comparison of the power of the chi 
square test with that of Kolmogorov’s test, based on the metric (4.23), was made 
by Massey [7] who, as one would expect, found the latter test more powerful. 

4.3. Power of tests using strongly distribution-free statistics. Let
$S(X_1, \dots, X_n, G)$ be a strongly distribution-free statistic in $\Omega$ with respect to $\Omega'$ and
$\varphi[S(X_1, \dots, X_n, H)]$ a (randomized) test function for the simple hypothesis $H(x)$,
that is, a function such that (1) $0 \le \varphi[S(X_1, \dots, X_n, H)] \le 1$ for all
$X_1, \dots, X_n$ in the sample space and every $H \in \Omega$, and (2) $\varphi[S(X_1, \dots, X_n, H)]$ is the
probability of rejecting $H$. Then, the power of the test depends only on
$\tau(u) = F H^{-1}(u)$, that is we have

$E\{\varphi[S(X_1, \dots, X_n, H)]; \, F\} = \nu[\tau(u)]$

for all $H \in \Omega$, $F \in \Omega'$.

Let $\delta(F, G)$ be a distance depending only on $\tau(u)$, such as the distances
described in 4.2. For fixed $H$ one may consider the “sphere” consisting of all
$F \in \Omega'$ such that $\delta(F, H) = \delta_0$, and try to determine the greatest lower bound of
the power for all these $F$, which will depend only on the distance $\delta_0$:

$\inf_{\delta(F, H) = \delta_0} E\{\varphi[S(X_1, \dots, X_n, H)]; \, F\} = w(\delta_0)$.

A problem of this kind has been treated in [18] where a sharp lower (and upper)
bound for the power is obtained for a one-sided test of fit, with regard to the
directed metric (4.24).

4.4. The alternative described in terms of a distance. The problem of testing the 
simple hypothesis that the true cumulative probability function is exactly equal 
to a completely specified H(x) is, in this formulation, somewhat unrealistic. 
In fact, it leads to the known difficulty that if a consistent test is used, for a 
sufficiently large sample one will practically always reject the hypothesis. It 
is well known that this difficulty can be avoided by stating carefully the
hypothesis and the alternative. If a distance $\delta(F, G)$ is defined in $\Omega$, this can be done by
considering the simple hypothesis $H(x)$ and the composite alternative consisting
of all $F \in \Omega$ such that $\delta(F, H) \ge \delta_0$. With this formulation of the problem, the
hypothesis will be rejected only if the test produces empirical evidence that the
true distribution differs from $H$ by a distance of at least $\delta_0$.


CONTRIBUTIONS TO THE STATISTICAL THEORY OF COUNTER
DATA


By G. E. Albert and Lewis Nelson
University of Tennessee and Oak Ridge National Laboratory 


Summary. A new mathematical model is proposed for the action of counters 
such as the Geiger-Mueller or the scintillation counters. It is assumed that after 
each registration the counter is inoperative for a time interval of random length. 
The distribution of lengths of the inoperative periods is so defined that the Type 
I and Type II models familiar in the literature on counters are special cases. 
More important, it also allows an action that is a compromise between those 
two types. Assuming that the sequence being counted is a Poisson process with 
mean rate of occurrence $mT$, $m > 0$, in an arbitrary interval of length $T$, the
process generated by the counter is discussed and rules are established for ob- 
taining confidence intervals for the parameter m from various types of count- 
ing experiments. 




1. The counter selection principle; formulation of the distribution problem.
A counter observes a segment of a sequence $\{f\}$ of events $f_1, f_2, f_3, \dots$ that
are randomly spaced on the positive time axis, $t > 0$. Due to an inherent
resolving time, the instrument may fail to record all of the events of $\{f\}$ that
occur during the interval of observation. Thus the recorded events (registrations
by the counter) form a second sequence $\{g\}$ of events $g_1, g_2, g_3, \dots$ also
randomly spaced in time with a distribution law that depends upon that of $\{f\}$
and upon the mathematical model used for the action of the counter. Thus, a
precise rule for the selection by the counter of the sequence $\{g\}$ from the sequence
$\{f\}$ must be specified. Two such rules have received attention in the literature
on counters. Briefly, they are as follows.

In a Type I counter there is a fixed resolving time $u > 0$ such that an event
of $\{f\}$ is selected for $\{g\}$ if and only if no event of $\{g\}$ has taken place during
the preceding time interval of length $u$. In a Type II counter the resolving time
is random and specified by the rule that an event of $\{f\}$ is selected for $\{g\}$ if
and only if no event of $\{f\}$ has taken place during the preceding time interval
of length $u$.

In a Type II counter an event of $\{f\}$ that occurs while the counter is locked
prolongs the locked period. In theory the counter can remain locked indefinitely.
This is not true of a Type I counter. It has been recognized by some authors on
this subject that actual counters select $\{g\}$ from $\{f\}$ in some manner that is a
compromise between the Type I and Type II rules (Feller [2]). Such a compromise will
be proposed and used in this paper. It includes the above rules as special cases.
Briefly, it will be assumed that the Type II rule holds with the exception that an
event of $\{f\}$ that occurs while the counter is locked may or may not prolong the
locked period subject to chance. This will be made more precise in the following
abstraction.

Received 4/7/52.

General rule of selection. Specify a positive number $u$ and a number $\theta$ in the
range $0 \le \theta \le 1$. A sequence $\{f\}$ of events is assumed generated on the positive
time axis by some (physical) stochastic process. The successive events of $\{f\}$
will be denoted by $f_1, f_2, f_3, \dots$. A subsequence of these, $f_{k_1}, f_{k_2}, f_{k_3}, \dots$
will be selected to be the respective elements $g_1, g_2, g_3, \dots$ of $\{g\}$ according
to the following:

(i) The event $f_1$ is selected as $g_1$; that is, $k_1 = 1$.

(ii) Following each selection of an event $f_{k_n}$ as $g_n$, there is a time interval
$\tau_n$, to be defined in (iii), during which no further selection for $\{g\}$ may be made.
That event of $\{f\}$ first following $g_n$ by time at least $\tau_n$ is selected as $g_{n+1}$. That
is, $k_{n+1}$ is the smallest integer such that the time interval between $f_{k_n}$ and $f_{k_{n+1}}$
is of length $\tau_n$ or more.

(iii) The time $\tau_n$ is random in the range $\tau_n \ge u > 0$. It is selected as follows.
From the segment of the sequence $\{f\}$

(1) $f_{k_n}, f_{k_n + 1}, f_{k_n + 2}, \dots$

choose a subsequence $\{f^{(n)}\}$ whose elements are as follows. The first element
$f_{k_n}$ of (1) is the first element $f_1^{(n)}$ of $\{f^{(n)}\}$. The succeeding events of (1) are
considered in succession and independently as candidates for $\{f^{(n)}\}$. Each is either
selected or rejected for $\{f^{(n)}\}$ with probabilities $\theta$ and $1 - \theta$ respectively. Let
$f_j^{(n)}$ be the first element of $\{f^{(n)}\}$ such that the time interval between $f_j^{(n)}$ and
$f_{j+1}^{(n)}$ is $u$ or more in length. The time $\tau_n$ is defined to be the sum of $u$ and the
time from $f_1^{(n)}$ to $f_j^{(n)}$. (Observe that $f_{j+1}^{(n)}$ is not necessarily the element $g_{n+1}$ defined
in (ii) above.)

Note that if $\theta = 0$, the sequence $\{f^{(n)}\}$ consists of the single element $f_{k_n}$. Thus,
$\tau_n = u$ and the general rule reduces to the Type I rule. If $\theta = 1$, $\{f^{(n)}\}$ is the
sequence (1) and the general rule becomes the Type II rule. Other values of $\theta$ lead
to a compromise between those two rules.
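
The general rule of selection can be traced step by step on a given list of event times. The following Python sketch is an illustration added here, not the authors' procedure; the names `select_registrations`, `f_times` and `rng` are hypothetical, and the event times are assumed to be sorted in increasing order.

```python
import random


def select_registrations(f_times, u, theta, rng):
    """Apply the general rule of selection to sorted event times of {f}.

    Returns the times of the registered events {g}.
    """
    g_times = []
    i, n = 0, len(f_times)
    while i < n:
        g_times.append(f_times[i])  # f_{k_n} is registered as g_n
        last = f_times[i]           # most recent element of the subsequence {f^(n)}
        j = i + 1
        # Step (iii): every event within u of the last element of {f^(n)} is a
        # candidate; it joins {f^(n)} (and prolongs the lock) with probability theta.
        while j < n and f_times[j] - last < u:
            if rng.random() < theta:
                last = f_times[j]
            j += 1
        # The lock ends at last + u, i.e. tau_n = u + (last - f_times[i]).
        # Step (ii): the first event at or after that instant is g_{n+1}.
        i = j
    return g_times


if __name__ == "__main__":
    events = [0.0, 0.5, 0.9, 1.0, 1.2, 2.5]
    print(select_registrations(events, 1.0, 0.0, random.Random(0)))  # Type I rule
    print(select_registrations(events, 1.0, 1.0, random.Random(0)))  # Type II rule
```

With $\theta = 0$ no candidate is ever accepted and the lock always lasts exactly $u$; with $\theta = 1$ every event inside the locked period prolongs it, which are the two limiting cases noted above.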

Some notations will be needed before proceeding. Probability functions and
cumulative distributions and their dependence upon parameters and conditions
will be denoted as in Cramér [1]. Let $S_1, S_2, \dots, S_n, \dots$ denote the times of
occurrence of the successive events of $\{g\}$. From the standpoint of a counter,
each of the intervals $S_{n+1} - S_n$, $n \ge 1$, is the sum of two parts, a time $\tau_n$ during
which the counter is inoperative following the occurrence of $g_n$ and a time $T_n$
during which the counter is ready to make the next selection $g_{n+1}$. Thus,

(2) $S_{n+1} - S_n = \tau_n + T_n$, $n \ge 1$.

It is implicitly assumed in the rule of selection that the counter is unlocked at
time zero. To complete the above notations, let $T_0 = S_1$. Define the quantities
$\xi_n$, $\eta_n$, $\zeta_n$ by

(3) $\xi_n = \sum_{j=0}^{n-1} T_j$,
    $\eta_{n-1} = \sum_{j=1}^{n-1} (\tau_j - u)$, $n \ge 2$; $\eta_0 = 0$,
    $\zeta_n = S_n - (n - 1)u$.

Actually a counter generates only a segment of the sequence $\{g\}$; that is, a
counter observes $\{f\}$ only during some finite time interval $0 < t < T$. Two types
of intervals of observation are used in practice. The number $T$ may be the random
time $S_N$ where $N$ is some preassigned integer or $T$ may itself be preassigned. The
first case will be designated as a fixed count experiment. The distribution function
of interest is

(4) $H_N(t) = P(\zeta_N \le t)$, $t > 0$.

The second case will be designated as a fixed time experiment. Here the number
$n(T)$ of events $g_1, g_2, \dots, g_{n(T)}$ of $\{g\}$ that occur in $0 < t \le T$ is a random
variable with a distribution function

(5) $P\{n(T) = N; \, T\}$, $N = 0, 1, 2, \dots$,

which may be expressed in terms of (4). Let $P_k = P[n(T) = k; \, T]$, $k =
0, 1, 2, \dots$. Clearly, $P_0 = P(S_1 > T) = 1 - H_1(T)$ and for $k \ge 1$, $P_k =
P(S_k \le T, \, S_{k+1} > T; \, T) = H_k[T - (k - 1)u] - H_{k+1}[T - ku]$. It follows that

(6) $P(n(T) \le N; \, T) = 1 - H_{N+1}(T - Nu)$.

Thus the determination of the function (4) solves the distribution problem for
either type of observation interval.


2. Calculation of the distributions for a Poisson case. Two general theories 
have been given in the literature for the treatment of a distribution problem 
such as that in hand; see Feller [2] and Malmquist [7]. Malmquist introduces 
certain auxiliary distributions which the present authors feel are extraneous to 
the counter problem. Feller presents a simple, elegant theory based upon opera- 
tional calculus. His method will be followed here. 

It will be assumed that the time from an arbitrary point $A$ on the positive time
axis to the first event of $\{f\}$ that follows $A$ is a random variable with cumulative
distribution

(7) $F(t; m) = 1 - \exp(-mt)$, $m > 0$ constant,

and that any number of such intervals which are nonoverlapping in pairs form
a set of independent variables. It follows that the $2N - 1$ terms on the right in

(8) $\zeta_N = \sum_{j=0}^{N-1} T_j + \sum_{j=1}^{N-1} (\tau_j - u)$

form a set of independent random variables. The $T_j$ in (8) all have the same
distribution $P(T_j \le t; m) = F(t; m)$ and the $\tau_j - u$ have a common distribution
$P(\tau_j - u \le t; m, \theta) = G(t; m, \theta)$ which must be determined by application of
the general rule of selection.

The Laplace-Stieltjes transform is a convenient operational tool for the
derivation of the distribution of the sum of independent nonnegative random variables.
Define the transforms

(9) $\varphi(s) = \int_0^\infty \exp(-st) \, dF(t; m)$,
    $\psi(s) = \int_0^\infty \exp(-st) \, dG(t; m, \theta)$,
    $\chi_N(s) = \int_0^\infty \exp(-st) \, dH_N(t; m, \theta)$,

where the dependence of $H_N$ defined in (4) upon the parameters $m$ and $\theta$ has been
put in evidence. By well known rules

(10) $\chi_N(s) = [\varphi(s)]^N [\psi(s)]^{N-1}$.

By (7) the transform $\varphi(s)$ is $(1 + s/m)^{-1}$ so that

(11) $[\varphi(s)]^N = (1 + s/m)^{-N}$.

The inverse of (11) is known:

(12) $F_N(t; m) = \frac{m^N}{\Gamma(N)} \int_0^t \exp(-mx) \, x^{N-1} \, dx$.



In order to find $\psi(s)$, the following lemma concerning part (iii) of the general
rule of selection is needed.

Lemma. Let $n$ and $\nu$ be arbitrary integers and $I$ an arbitrary time interval of
length $t_I$. The probability of the occurrence of exactly $\nu$ events of the sequence
$\{f^{(n)}\}$ in the interval $I$ is

(13) $P(\nu) = \exp(-m\theta t_I) \cdot (m\theta t_I)^\nu / \nu!$.

The proof is well known since the distribution of the number of events of
$\{f^{(n)}\}$ in $I$ is the compound distribution of the Poisson probability
$[\exp(-m t_I)](m t_I)^\mu / \mu!$ for the number $\mu$ of events of the sequence (1) in $I$ and the binomial
probability $\binom{\mu}{\nu} \theta^\nu (1 - \theta)^{\mu - \nu}$ for the number $\nu$ of those $\mu$ events selected for $\{f^{(n)}\}$;
(see Feller [3]).
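
The compounding argument in the proof can be checked numerically by summing the Poisson and binomial probabilities over $\mu$ and comparing with (13). The following Python sketch is an illustration added here; the function names are hypothetical, and the infinite sum is truncated after `terms` summands, which is adequate only for moderate values of $m t_I$ and small $\nu$.

```python
import math


def thinned_count_prob(v, m, theta, t, terms=100):
    """Sum over mu >= v of Poisson(mu; m*t) * Binomial(v; mu, theta), truncated."""
    total = 0.0
    for mu in range(v, v + terms):
        # Probability of mu events of the sequence (1) in an interval of length t
        poisson = math.exp(-m * t) * (m * t) ** mu / math.factorial(mu)
        # Probability that exactly v of those mu events are selected
        binom = math.comb(mu, v) * theta ** v * (1 - theta) ** (mu - v)
        total += poisson * binom
    return total


def poisson_prob(v, rate):
    """Right-hand side of (13) with rate = m * theta * t_I."""
    return math.exp(-rate) * rate ** v / math.factorial(v)


if __name__ == "__main__":
    m, theta, t = 2.0, 0.3, 1.5
    for v in range(4):
        print(v, thinned_count_prob(v, m, theta, t), poisson_prob(v, m * theta * t))
```

The two columns printed agree to within rounding, reflecting the fact that random selection with probability $\theta$ turns a Poisson process of rate $m$ into one of rate $m\theta$.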

Let $\Delta$ denote the random interval length between successive events of the
sequence $\{f^{(n)}\}$. By (13) the probability that at least one event of $\{f^{(n)}\}$ occurs
in the interval $(0, t)$ is $1 - \exp(-m\theta t)$. Thus the conditional distribution of $\Delta$
subject to the condition $\Delta < u$ is

(14) $P(\Delta \le t \mid \Delta < u; \, m, \theta) = \frac{1 - \exp(-m\theta t)}{1 - \exp(-m\theta u)}$ if $0 \le t < u$, and $= 1$ if $t \ge u$.



The transform of (14) is

(15) $m\theta \{1 - \exp[-(m\theta + s)u]\} / \{[1 - \exp(-m\theta u)] \cdot (m\theta + s)\}$.

In the sequence $\{f^{(n)}\}$ suppose that the successive events $f_1^{(n)}, f_2^{(n)}, \dots,
f_{k+1}^{(n)}$ occur separated successively by exactly $k$ times less than $u$ and that
$f_{k+2}^{(n)}$ succeeds $f_{k+1}^{(n)}$ by time $u$ or more. The probability of such an occurrence is

(16) $[1 - \exp(-m\theta u)]^k \exp(-m\theta u)$

and by the general rule of selection, the conditional distribution of $\tau_n - u$ subject
to the above occurrence is the $k$-fold convolution of (14). It follows that the
transform $\psi(s)$ is the sum over $k = 0, 1, 2, \dots$ of the product of (16) by the $k$th
power of (15). This summation readily gives

(17) $\psi(s) = (m\theta + s) \cdot \exp(-m\theta u) / \{m\theta \cdot \exp[-(m\theta + s)u] + s\}$.

For more detail in the above argument, see Feller [2] where the case $\theta = 1$ is
treated. In deriving (17), it was assumed without mentioning that $\theta > 0$. The
result persists for $\theta = 0$ since in that case it reduces to $\psi(s) = 1$ which means
that $P(\tau_n = u) = 1$ in agreement with the Type I rule of selection.
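
The transform (17) can be compared with a direct simulation of $\tau_n - u$: by the lemma, successive gaps of $\{f^{(n)}\}$ are exponential with rate $m\theta$, and the gaps shorter than $u$ are accumulated until the first gap of length $u$ or more. The following Python sketch is a Monte Carlo illustration added here, not part of the original derivation; all function names are hypothetical.

```python
import math
import random


def psi(s, m, theta, u):
    """Closed form (17) for the transform of tau_n - u."""
    a = m * theta
    if a == 0.0:
        return 1.0  # Type I case: tau_n = u with probability one
    return (a + s) * math.exp(-a * u) / (a * math.exp(-(a + s) * u) + s)


def sample_tau_minus_u(m, theta, u, rng):
    """Draw one value of tau_n - u: sum the gaps of {f^(n)} that are shorter than u."""
    a = m * theta
    if a == 0.0:
        return 0.0
    total = 0.0
    while True:
        gap = rng.expovariate(a)  # gap between successive events of {f^(n)}
        if gap >= u:
            return total          # first gap of length u or more ends the chain
        total += gap


def mc_transform(s, m, theta, u, n_draws, rng):
    """Monte Carlo estimate of E[exp(-s (tau_n - u))]."""
    acc = sum(math.exp(-s * sample_tau_minus_u(m, theta, u, rng)) for _ in range(n_draws))
    return acc / n_draws


if __name__ == "__main__":
    rng = random.Random(12345)
    print(mc_transform(1.0, 1.0, 0.5, 1.0, 50_000, rng), psi(1.0, 1.0, 0.5, 1.0))
```

The simulated and closed-form values should agree up to sampling error, which gives an independent check on the geometric summation leading to (17).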

Inversion of the transform $[\psi(s)]^{N-1}/s$ gives the cumulative distribution
$G_{N-1}(t; m, \theta)$ of $\eta_{N-1}$ defined in (3). The result is

(18) $G_{N-1}(t; m, \theta) = \exp[-(N - 1)m\theta u] \sum_{n=0}^{M} (-1)^n \binom{N + n - 2}{n} \exp(-nm\theta u) \sum_{r=0}^{N-1} \binom{N - 1}{r} \frac{[m\theta(t - nu)]^{n+r}}{(n + r)!}$

where $M$ is the largest integer such that $Mu < t$.
It is interesting to note that Raff [8] has used a special case of (18) as a waiting
time distribution in a certain traffic problem.
By (10), the convolution

(19) $H_N(t; m, \theta) = \int_0^t F_N(t - x; m) \, dG_{N-1}(x; m, \theta)$

using (12) and (18) gives the distribution function (4). The results of the
convolution (19) are so complicated as to be almost useless. In performing the
integration, it must be noted that $G_{N-1}$ is discontinuous at $t = 0$ and that the
integral is in the Stieltjes sense. Taking these facts into account, it will be
convenient for later purposes to write (19) in the form

(20) $H_N(t; m, \theta) = \exp[-(N - 1)m\theta u] \, F_N(t; m) + [1 - \exp\{-(N - 1)m\theta u\}] \, Q_N(t; m, \theta)$

where $Q_N(t; m, \theta)$ is the conditional distribution of $\zeta_N$ subject to the condition
that $\eta_{N-1} > 0$. The actual form of $Q_N$ will not be used in the sequel. The interested
reader can easily calculate it.

In the case of a Type I counter, $\theta = 0$ and the right member of (20) reduces to
(12). If $\theta > 0$, but $(N - 1)m\theta u$ is small, (20) is approximated by (12) so that the
counter behaves nearly like a Type I. Whether or not the accuracy of the
approximation is sufficient is a matter of judgement for the individual reader. The
remarks following Theorem 3 in Section 5 below give further aid in such
judgement. The asymptotic results derived in the next section apply to cases in which
$N/mu$ is large.


3. Asymptotic percentage points of $H_N(t; m, \theta)$. Let $p$ be an assigned number
in the range $0 < p < 1$ and define the percentage point $t_p$ by means of

(21) $p = P(\zeta_N \le t_p; \, m, \theta) = H_N(t_p; m, \theta)$.

Exact calculation of $t_p$ appears feasible only in cases where $H_N$ reduces to (12).
Define the parameters

(22) $M = mu$, $\lambda = N/M$.

In a well designed counting experiment $\lambda$ will usually be quite large. Under this
condition approximate normalization of $H_N$ is permissible for the calculation of
$t_p$.

The distribution function of the random variable $m\zeta_N/\lambda$ will be denoted by
$H_N^*(t; m, \theta)$. Its transform is given by

(23) $\chi_N^*(s) = \chi_N(ms/\lambda)$


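where (10) is to be used on the right. The cumulants of the distribution $H_N^*$ are
readily found from (23) on inserting (11) and (17) in (10). They are

(24) $\kappa_1 = M + (M\lambda - 1)(\lambda\theta)^{-1}(e^{M\theta} - 1 - M\theta)$,
     $\kappa_r = (r - 1)! \left[ M\lambda^{1-r} - (M\lambda - 1)(\lambda\theta)^{-r} \left\{ 1 - r \sum_{j=1}^{r} \frac{1}{j} \, \frac{(-jM\theta)^{r-j}}{(r - j)!} \, e^{jM\theta} \right\} \right]$, $r \ge 2$.

Let $x_p$ and $t_p^*$ be defined by

$p = (2\pi)^{-1/2} \int_{-\infty}^{x_p} \exp(-\tfrac{1}{2}x^2) \, dx = H_N^*(\kappa_1 + \kappa_2^{1/2} t_p^*; \, m, \theta)$.

Also, define the constants $l_r$, $r \ge 3$, by

(25) $l_r = \kappa_r / \kappa_2^{r/2}$.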
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An application of the formula (6.75) in Kendall [5] gives 
(26) t) = % + ; @, — 1) + zi (x, — 32) — zs (2x5, — 52,) + O(X*”). 
) ) oy 


Only a part of Kendall's formula has been retained. The interested reader may
obtain the terms of orders $O(\lambda^{-3/2})$ and $O(\lambda^{-2})$ on recognizing that Kendall's
symbols x and $\xi$ are respectively the $t_p^*$ and $x_p$ of (26). (The reader should be
warned that the reference formula contains several misprints in the third edition
of Kendall's book.) The percentage point $t_p$ defined in (21) is given by (26) and


(27) $m t_p/\lambda = \kappa_1 + \kappa_2^{1/2} t_p^*.$
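
As a check on how (26) and (27) are used, the following sketch evaluates the retained terms of (26) and applies them to a case where the cumulants are known in closed form: a sum of N unit exponentials (the Type I situation of Section 4), for which $\kappa_r = (r-1)!\,N$. The function names are illustrative and not from the paper, and the expansion coefficients are the standard Cornish-Fisher terms, assumed here to be what Kendall's formula supplies.

```python
import math

def cornish_fisher(x, l3, l4):
    """Retained terms of (26): standardized percentage point t* for normal deviate x.

    l3 and l4 are the standardized cumulants kappa_r / kappa_2**(r/2), r = 3, 4
    (assumed form of the constants in (25)).
    """
    return (x
            + l3 * (x ** 2 - 1) / 6
            + l4 * (x ** 3 - 3 * x) / 24
            - l3 ** 2 * (2 * x ** 3 - 5 * x) / 36)

def gamma_point_from_26(N, x):
    """Approximate percentage point of a sum of N unit exponentials via (26), (27)."""
    l3 = 2 / math.sqrt(N)   # kappa_3 / kappa_2**1.5 with kappa_r = (r-1)! N
    l4 = 6 / N              # kappa_4 / kappa_2**2
    return N + math.sqrt(N) * cornish_fisher(x, l3, l4)

def gamma_point_closed_form(N, x):
    """The same quantity written as the series N{1 + x/N^(1/2) + ...}."""
    return N * (1 + x / math.sqrt(N)
                + (x ** 2 - 1) / (3 * N)
                + (x ** 3 - 7 * x) / (36 * N ** 1.5))

print(gamma_point_from_26(600, 2.0), gamma_point_closed_form(600, 2.0))
```

The two functions agree identically, which is the algebra behind the Type I series used in the next section.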


4. Confidence intervals for m; Type I counter. The general methods to be em- 
ployed in this section are described in Cramér [1] and Kendall [6]. 

The object of most counting experiments is the estimation of the mean rate m 
of occurrence per unit time of the events of the sequence {f}. The fundamental 
resolving time u of the counter is usually regarded as known; it will be small. 
For radioactivity counters u is of the order of magnitude of 10~* to 10° second 
depending upon the refinement of the counter. It will be convenient to establish 
confidence intervals for the parameter M defined in (22). 

A precise argument can be given establishing confidence intervals for M for a
Type I counter. The argument is difficult to make precise for a general type of counter.

In a great many counting experiments the product $(N - 1)mu$ will be small
enough that (20) may be regarded as essentially equal to (12); that is, that the
counter is of Type I. In this case the percentage point $t_p$ defined in (21) is given
by the exact equation


(28) $p = \frac{1}{2\Gamma(N)} \int_0^{2mt_p} \exp(-\tfrac{1}{2}x)\,(\tfrac{1}{2}x)^{N-1}\,dx.$


It will be recognized that $2mt_p$ is the percentage point of the chi square distribution
with 2N degrees of freedom. In this case (26) and (27) give


(29) $M t_p = Nu\{1 + x_p/N^{1/2} + (x_p^2 - 1)/3N + (x_p^3 - 7x_p)/36N^{3/2}\} + O(\lambda^{-1}).$


Clearly, $t_p$ is a monotone decreasing function of M; let this dependence be denoted
by $t_p(M)$.

Suppose now that the counting experiment is of the fixed count type. Then N
is assigned and $t_N$ is the observed random variable. The equation


(30) $t_p(L_p) = t_N$


defines a new random variable $L_p$ such that, no matter what the true value of M
is,


(31) $P(L_p < M; M, \theta) = 1 - p.$

Let $p_1$ and $p_2$, $p_1 < p_2$, be two values of p. By (28) $t_{p_1} < t_{p_2}$ and by (29) and
(30) $L_{p_1} < L_{p_2}$. It follows by well known rules of probability theory that

$P(L_{p_1} < M < L_{p_2}; M, \theta) = p_2 - p_1.$


This proves 

THEOREM 1. (Type I counter; fixed count experiment.) Let N be an assigned
integer and $p_1$, $p_2$ assigned probabilities with $0 \le p_1 < p_2 \le 1$. An interval of
$100(p_2 - p_1)$ percent confidence for M = mu is given by


(32) $L_{p_1} < M < L_{p_2}$

in which the $L_p$ are defined by

(33) $L_p t_N = Nu\{1 + x_p/N^{1/2} + (x_p^2 - 1)/3N + (x_p^3 - 7x_p)/36N^{3/2}\} + O(\lambda^{-1}), \quad p = p_1, p_2,$


and where $t_N$ is defined by the last equation in (3).
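
The interval of Theorem 1 is a direct evaluation of (33). The following is a minimal sketch with hypothetical function names, dropping the remainder term. The numbers are those of Example 1 in Section 6 (N = 600, an observed time of 290 seconds, normal deviates of -2 and +2), with the resolving time taken as u = 10^-4 second, an assumption chosen to agree with the value of about 0.12 quoted there for (N - 1)M.

```python
import math

def type1_fixed_count_limit(N, t_N, u, x_p):
    """Confidence limit L_p for M = m*u from (33), remainder term dropped."""
    series = (1 + x_p / math.sqrt(N)
              + (x_p ** 2 - 1) / (3 * N)
              + (x_p ** 3 - 7 * x_p) / (36 * N ** 1.5))
    return N * u * series / t_N

def type1_fixed_count_interval_for_m(N, t_N, u, x_lower, x_upper):
    """Interval for the rate m = M/u obtained from the two limits on M."""
    return (type1_fixed_count_limit(N, t_N, u, x_lower) / u,
            type1_fixed_count_limit(N, t_N, u, x_upper) / u)

low, high = type1_fixed_count_interval_for_m(600, 290.0, 1e-4, -2.0, 2.0)
print(f"{low:.4f} < m < {high:.4f}")
```

For a Type I counter the resolving time cancels when the limits on M are converted to limits on m, so the assumed value of u does not affect the printed interval.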

Continuing with the Type I rule of selection ($\theta = 0$), suppose that a fixed time
experiment is to be used in counting. Since the random variable n(T) of this
case is discrete valued, exact percentage point functions analogous to the $t_p(M)$
defined in (28) are not obtainable. The following procedure is based upon
percentage points for n(T) in terms of M so chosen that the confidence intervals
derived from them have confidence levels greater than or equal to $p_2 - p_1$,
the inequality on the probability being as close to an equality as is possible.

Assign $p_1$ and $p_2$, $0 \le p_1 < p_2 \le 1$, and let two functions $N_{p_i}(M)$, i = 1, 2,
be defined as follows. Let $N_{p_1}(M)$ be the smallest integer such that


(34) $P\{n(T) \ge N_{p_1}(M); T, M\} \le p_1$

and let $N_{p_2}(M)$ be the largest integer such that

(35) $P\{n(T) \ge N_{p_2}(M); T, M\} \ge p_2.$


These functions are the analogues of the percentage points $t_{p_i}(M)$ used in the
previous case. It will be shown presently that (29) may again be used. By (6)
and the Type I assumption, for any integer $N \ge 1$,

(36) $P\{n(T) \ge N; T, M\} = F_N\{T - (N - 1)u; m\} = \frac{1}{2\Gamma(N)} \int_0^{2m[T-(N-1)u]} \exp(-\tfrac{1}{2}x)\,(\tfrac{1}{2}x)^{N-1}\,dx,$


and $P\{n(T) \ge 0; T, M\} = 1$. Using the integral (36), define two sequences
$M_k^{(i)}$, $k = 0, 1, 2, \cdots$, i = 1, 2, by the equations


(37) $M_0^{(i)} = 0, \quad i = 1, 2,$
     $p_i = P\{n(T) \ge k; T, M_k^{(i)}\}.$

It is clear from the integrals involved that for each pair (i, k), $M_k^{(i)} < M_{k+1}^{(i)}$, and
for any $k \ge 1$, $M_k^{(1)} < M_k^{(2)}$. It follows at once from the definition (34) that


(38) $N_{p_1}(M) = k$ if $M_{k-1}^{(1)} < M \le M_k^{(1)}$, $k = 1, 2, 3, \cdots$

and similarly from the definition (35) that

(39) $N_{p_2}(M) = k$ if $M_k^{(2)} \le M < M_{k+1}^{(2)}$, $k = 0, 1, 2, \cdots$.


These two functions are monotone nondecreasing in M; the first is continuous on
the left and the second continuous on the right. They are the stairstep percentage
point functions familiar in the theory of confidence intervals for the parameter of
a discontinuous distribution.

Define two new random variables as functions of n(T) as follows: if n(T) = k,
$k = 0, 1, 2, \cdots$,


$L^*_{p_1}[n(T)] = M_k^{(1)},$
$L^*_{p_2}[n(T)] = M_{k+1}^{(2)}.$


Let x be an arbitrary number in the range $M_{k-1}^{(1)} < x \le M_k^{(1)}$. The inequality
$L^*_{p_1} < x$ is equivalent to $n(T) \le k - 1$. Thus, by (38), $P\{L^*_{p_1} < x; T, M\} =
P\{n(T) \le k - 1; T, M\} = 1 - P\{n(T) \ge N_{p_1}(x); T, M\}$ for x in the above
range. For an appropriate integer k, one may choose x = M whatever M may be.
Then by (34)


(40) $P\{L^*_{p_1} < M; T, M\} \ge 1 - p_1$


whatever M may be. Now let x be arbitrarily chosen in the range $M_k^{(2)} \le x <
M_{k+1}^{(2)}$. By a similar argument $P\{L^*_{p_2} \le x; T, M\} = 1 - P\{n(T) \ge N_{p_2}(x); T, M\}$.
Again, using x = M with an appropriate choice of k, (35) shows that


$P\{L^*_{p_2} \le M; T, M\} \le 1 - p_2$


regardless of what M may be. It follows that

$P\{L^*_{p_1} < M < L^*_{p_2}; T, M\} \ge p_2 - p_1.$

This proves
THEOREM 2. (Type I counter; fixed time experiment.) Let an interval of observation
$0 < t \le T$ be assigned and let $0 \le p_1 < p_2 \le 1$. A confidence interval for the
parameter M = mu for confidence level at least $100(p_2 - p_1)$ percent is given by


(41) $L^*_{p_1} < M < L^*_{p_2}$


where the limits are the random variables defined by (40) and (37). (Results similar
to this are given for the Poisson distribution by Garwood [4] and for the binomial
distribution by Kendall [6]. It seems to have gone unnoticed that strict
inequalities are obtainable.)
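
The limits of Theorem 2 can be computed directly from (36) and (37), since for integer N the integral in (36) equals one minus a finite Poisson sum. The following sketch solves (37) by bisection. Function names are hypothetical, and the resolving time u = 10^-6 minute in the usage lines is an assumed placeholder (any value much smaller than T gives the same digits); the remaining numbers are those of Example 2 in Section 6.

```python
import math

def prob_count_at_least(k, a):
    """P{n(T) >= k} for a Type I counter, where a = m*(T - (k-1)*u), as in (36)."""
    if k <= 0:
        return 1.0
    term, poisson_sum = 1.0, 0.0
    for j in range(k):
        poisson_sum += term          # adds a**j / j!
        term *= a / (j + 1)
    return 1.0 - math.exp(-a) * poisson_sum

def solve_for_a(k, p, upper=1.0e3):
    """Bisection for the a with prob_count_at_least(k, a) = p (increasing in a)."""
    lower = 0.0
    for _ in range(200):
        mid = 0.5 * (lower + upper)
        if prob_count_at_least(k, mid) < p:
            lower = mid
        else:
            upper = mid
    return 0.5 * (lower + upper)

def exact_limits_for_m(n_obs, T, u, p1, p2):
    """Limits on m = M/u: M_k^(1) with k = n(T), and M_(k+1)^(2), from (37)."""
    k = n_obs
    m_low = 0.0 if k == 0 else solve_for_a(k, p1) / (T - (k - 1) * u)
    m_high = solve_for_a(k + 1, p2) / (T - k * u)
    return m_low, m_high

m_low, m_high = exact_limits_for_m(4, 500.0, 1e-6, 0.05, 0.95)
print(f"{m_low:.5f} < m < {m_high:.5f}")
```

Twice the solved value of a is the corresponding chi square percentage point with 2k degrees of freedom, so a chi square table can be used in place of the bisection.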

Practical calculations of the confidence limits (41) are effected as follows.
Comparison of the integrals (28) and (36) indicates that an asymptotic formula for
$M_N^{(i)}$ is obtained by equating the quantity $M_N^{(i)}\cdot[T - (N - 1)u]$ to the right
member of (29) with p replaced by $p_i$. Thus the limits (41) may be obtained from
(33) with the following changes. First replace $t_N$ in (33) by $T - (N - 1)u$ and
then replace N throughout the resulting equation by n(T) to obtain $L^*_{p_1}$ and by
n(T) + 1 to obtain $L^*_{p_2}$. Example 2 of Section 6 illustrates this computation.
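
The substitutions just described can be sketched as follows. This is a minimal illustration with hypothetical function names; it drops the remainder term of (33) and, in the usage lines, assumes u = 10^-6 minute as a placeholder for the resolving time in the setting of Example 2 (T = 500 minutes, an observed count of 4, normal deviates of -1.645 and +1.645).

```python
import math

def series_33(N, x_p):
    """Bracketed series of (33) for count N and normal deviate x_p."""
    return (1 + x_p / math.sqrt(N)
            + (x_p ** 2 - 1) / (3 * N)
            + (x_p ** 3 - 7 * x_p) / (36 * N ** 1.5))

def fixed_time_limits_for_m(n_obs, T, u, x_lower, x_upper):
    """Asymptotic limits on m = M/u for a fixed time experiment (Type I counter).

    The observed time in (33) is replaced by T - (N - 1)*u, and N by n(T) for
    the lower limit and by n(T) + 1 for the upper limit. Requires n_obs >= 1.
    """
    n_low, n_high = n_obs, n_obs + 1
    m_low = n_low * series_33(n_low, x_lower) / (T - (n_low - 1) * u)
    m_high = n_high * series_33(n_high, x_upper) / (T - (n_high - 1) * u)
    return m_low, m_high

m_low, m_high = fixed_time_limits_for_m(4, 500.0, 1e-6, -1.645, 1.645)
print(f"{m_low:.5f} < m < {m_high:.5f}")
```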

Central intervals ($p_1 = 1 - p_2$) obtained from Theorem 1 are optimal in the
sense described by Kendall [6], sections 19.10 through 19.12. Whether or not this
is true for Theorem 2 is not clear to the authors. In Theorem 1 the random variable
$t_N$ defined in (8) reduces to the sum of N independent quantities $\tau_j$,
$j = 0, 1, \cdots, N - 1$, whose common probability density is the derivative $m\exp(-mt)$
of (7) on t > 0. The likelihood $L = m^N \exp(-m\sum_{j=0}^{N-1}\tau_j)$ of the $\tau_j$ satisfies the
conditions of the above reference. It follows from Kendall's discussion that the
central confidence intervals $(N/t_N)(1 \pm x_{p_2}/N^{1/2})$ for m are asymptotically
(relative to N) shortest on the average in a general class of intervals obtained by use
of the central limit theorem. These optimal intervals agree asymptotically with
the results of Theorem 1.


5. Confidence intervals for m; general counter type. Precise proofs of theorems 
analogous to Theorems 1 and 2 for the case of the general counter model would 
be very difficult. The distribution (19) being regarded as unusable, the entire 
argument must be based upon the asymptotic formulas (26), (27) for the percentage
points $t_p$ of (19). The vague nature of the error estimate in those formulas
prohibits precise arguments and results. Further formal manipulations for which 
no general error estimate seems possible will be introduced presently. 

Consider the general counter type. The formula (29) used in connection with 
the Type I counter will be replaced by 


(42) $M t_p = \lambda u(\kappa_1 + \kappa_2^{1/2} t_p^*)$


in which $t_p^*$ is given by (26) with the full forms of the cumulants (24) used. In
this case, since $\theta > 0$, the ratios (25) depend upon M in a complicated manner.
Indeed, the dependence of the right member of (42) upon M is so formidable as
to make the formula almost useless for any general considerations. A simplifica- 
tion of (42) will be given below for small values of M. In most well designed 
counting experiments M will be quite small; the experimenter has some latitude 
in the choice of values of u to effect this. The restriction M < 0.1 will not exclude 
many experiments; for example, in radioactivity counting with the best available 
counter, u = 10° second and the restriction amounts to the requirement that 
the source emit no more than 10’ particles per second in the direction of the 
counter. Fairly extensive numerical calculations of $t_p(M)$ performed by the
authors indicate that an expansion of the right member of (42) as a series of
powers of M retaining powers through $M^4$ gives a simple formula for $t_p(M)$ that
should be accurate to within one tenth of one percent in most cases provided that
$M \le 0.1$; the error decreases rapidly as M tends to zero. The case M > 0.1 will
be mentioned briefly later.

The cumulants (24) are readily expressed as power series in M:
ke = M(r — 1)!" + (MA — 19)” DS ALY(Me)", 
n=r+l1 
(43) i 


)o-0™, r= 1,2,3,--- 
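By formal manipulations of these series one reduces (42) to

(44) $M t_p = \{a_0 + a_1M^2 + a_2M^3 + a_3M^4 + \cdots\} + O(\lambda^{-1})$

where the coefficients $a_i$ are given by

(45)
$a_0 = Nu\{1 + x_p/N^{1/2} + (x_p^2 - 1)/3N + (x_p^3 - 7x_p)/36N^{3/2}\},$

$a_1 = (N - 1)u\theta/2,$

$a_2 = \frac{(N - 1)u}{6}\{\theta^2 + x_p\theta/N^{1/2} - 2(x_p^2 - 1)\theta/3N + (13x_p^3 - 19x_p)\theta/36N^{3/2}\},$

$a_3 = \frac{(N - 1)u}{6}\{\theta^3/4 + x_p\theta^2/N^{1/2} + (3 - 8\theta)\theta(x_p^2 - 1)/12N + [(13\theta - 12)x_p^3 - (19\theta - 30)x_p]\theta/36N^{3/2}\}.$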

To within the accuracy of the terms retained in (44) and (45), the derivative of
$t_p(M)$ will be negative over the entire range 0 < M < 0.1 provided that $a_2$ and $a_3$
are positive and

(46) $a_1 + 0.2a_2 + 0.03a_3 < 100a_0.$


If this is satisfied, $t_p(M)$ will be a monotone decreasing function of M in the
indicated range. For instance, it is easy to show that (46) is satisfied for all values
of $\theta$ in $0 \le \theta \le 1$ if $x_p$ and N satisfy $1 \le x_p \le N/4$. The investigation required
by a given case is easily made.

Writing (44) in the form


$-a_0 = -t_pM + a_1M^2 + a_2M^3 + a_3M^4 + \cdots,$


standard inversion formulas for power series may be applied to calculate M in
terms of $t_p$ and the $a_i$. The result of inverting the equation $t_p(L_p) = t_N$ appears
in (47) below.

An extension of Theorem 1 to the general counter is now immediate. Extension
of Theorem 2 is then accomplished by the substitutions described immediately
below the statement of Theorem 2. Thus

THEOREM 3. (General counter.) Let $p_1$ and $p_2$ be assigned, $0 \le p_1 < p_2 \le 1$.
If $0 < M \le 0.1$ and (46) is satisfied, an approximate confidence interval for the
parameter M = mu for confidence level $100(p_2 - p_1)$ percent is given by:

(i) for a fixed count experiment with count N, the interval (32) with the $L_p$
obtained from

(47) $L_p = \frac{a_0}{t_N}\left[1 + \frac{a_0a_1}{t_N^2} + \frac{a_0^2a_2}{t_N^3} + \frac{2a_0^2a_1^2 + a_0^3a_3}{t_N^4} + \frac{5a_0^3a_1a_2}{t_N^5} + \frac{5a_0^3a_1^3}{t_N^6} + \cdots\right] + O(\lambda^{-1}), \quad p = p_1, p_2;$

(ii) for a fixed time experiment in the interval $0 < t \le T$, the interval (41) with
the $L^*_p$ obtained from (47) as follows. First replace $t_N$ by $T - (N - 1)u$ and $L_p$ by
$L^*_p$. Then, for $L^*_{p_1}$ replace N by the count n(T) and for $L^*_{p_2}$ replace N by n(T) + 1.
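
A sketch of part (i) of Theorem 3 follows, under stated assumptions. The coefficient formulas are those of (45) as read here. The inverted series of (47) is truncated after the terms in $a_0a_1$, $a_0^2a_2$ and $a_0^2a_1^2$, which are the leading terms of the standard power series reversion. The function names are hypothetical. The usage lines take N = 75,000 and an observed time of 30.1 seconds from case (i) of Example 3, with u = 4·10^-5 second assumed.

```python
import math

def coefficients_45(N, u, theta, x_p):
    """Coefficients a0, a1, a2 of (44), as read from (45)."""
    root_n = math.sqrt(N)
    a0 = N * u * (1 + x_p / root_n
                  + (x_p ** 2 - 1) / (3 * N)
                  + (x_p ** 3 - 7 * x_p) / (36 * N ** 1.5))
    a1 = (N - 1) * u * theta / 2
    a2 = (N - 1) * u / 6 * (theta ** 2
                            + x_p * theta / root_n
                            - 2 * (x_p ** 2 - 1) * theta / (3 * N)
                            + (13 * x_p ** 3 - 19 * x_p) * theta / (36 * N ** 1.5))
    return a0, a1, a2

def general_counter_limit(N, t_N, u, theta, x_p):
    """Truncated form of (47): limit L_p on M = m*u for a fixed count experiment."""
    a0, a1, a2 = coefficients_45(N, u, theta, x_p)
    bracket = (1
               + a0 * a1 / t_N ** 2
               + a0 ** 2 * a2 / t_N ** 3
               + 2 * a0 ** 2 * a1 ** 2 / t_N ** 4)
    return a0 / t_N * bracket

u = 4e-5  # assumed resolving time in seconds
for theta in (0.0, 1.0):
    low = general_counter_limit(75000, 30.1, u, theta, -2.0) / u
    high = general_counter_limit(75000, 30.1, u, theta, 2.0) / u
    print(f"theta = {theta}: {low:.1f} < m < {high:.1f}")
```

Setting theta to zero makes the bracket equal to unity, so the limits reduce to those of Theorem 1.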

If M is small and N is large, one expects each of the quantities $a_i/t_N$ in (47)
to be near the value M. In such cases the approximation


(48) $L_p = \frac{a_0}{t_N}\left[1 + \frac{a_0a_1}{t_N^2}\right]$


for (47) will suffice for practical purposes. This will likely be the situation in the
great majority of counting experiments where confidence intervals are desired
for the parameter m. Indeed, the term $a_0a_1/t_N^2$ in the bracket in (48) will usually
be quite small compared with unity. These remarks indicate the extent to which
the Type I counter assumption is justified; the examples in the next section should
clarify these remarks.

The power series manipulations used in obtaining (44) and its inverse are
valid for values of M greater than 0.1 and the monotoneity condition is easily
extended. The authors cannot recommend the results for accuracy. The reader
who may be interested in counting experiments for which M > 0.1 should graph
$t_p(M)$ using the closed forms (24) of the cumulants $\kappa_r$ in (26), (27) and (42) for
values of the various parameters that are of interest. The range of monotoneity
of $t_p(M)$ will then be evident and inversion of $t_p(L_p) = t_N$ is easily performed
from such graphs. It does not seem feasible to provide a sufficient number of
such graphs in this paper to cover the multiplicity of conditions that might arise
in counting experiments.

The complexity of the distribution (19) bars any discussion of minimum 
average length confidence intervals. 


6. Examples. The three examples given below illustrate the use of Theorems 
1, 2 and 3. In the first example the rate of m = 2 particles per second is typical 
of radio-isotope tracer experiments. The extremely low rate of four particles in 
eight hours of Example 2 might be found in a cosmic ray count while the very 
high rate of m = 2500 particles per second in Example 3 might be found in a 
nuclear physics laboratory. 

In designing a counting experiment it is important to make use of Theorem 1 
whenever possible in order to utilize its statistical and practical efficiency. This 
is brought out in the examples. In both of Examples 1 and 2 it turns out that 
confidence intervals for m are essentially independent of the counter type.
Example 2 illustrates the construction used in the proof of Theorem 2 and compares
precise results with asymptotic results. Example 3 shows the effects of variations
in the resolving time u and the sample size N.

EXAMPLE 1. (Fixed count.) Suppose that the counting rate is expected to be
about m = 2 particles per second. It is desired to obtain a 95.46 per cent
($x_{p_2} = -x_{p_1} = 2$) central confidence interval for m using a counter for which
$u = 10^{-4}$ second and a five minute observation interval. Fix N = 600 and suppose that
$t_N = 290$ seconds is observed. Since $(N - 1)M$ is approximately 0.12, it appears
from (20) that Theorem 3 should be used. By (45), keeping five decimal places of
accuracy in the computation of the $a_i/ut_N$, one finds
$a_0/t_N = 2.06897(1.00167 \mp 0.08163)u$, $a_1/t_N = 1.03276u\theta$, and
$a_2/t_N = 0.34425(\theta^2 - 0.00334\theta \mp 0.08176\theta)$, the upper signs being used
for $L_{p_1}$ and the lower for $L_{p_2}$. Since $u = 10^{-4}$, it is clear at once that the
bracket in (47) is essentially unity. Thus the
confidence interval is 1.90354 < m < 2.24132. Note that the result is the same
as would have been obtained from Theorem 1; this is true only to within the
accuracy of the computation.

EXAMPLE 2. (Fixed time.) Suppose that m is expected to be about 1/120 particle 
per minute and that wu = 10 minute. Fix T = 500 minutes and suppose n(7) = 
4 observed. Here, (20) indicates that the counter may be assumed to be of Type I
so Theorem 2 may be used. By (41), the limits are $L^*_{p_1} = M_4^{(1)}$ and $L^*_{p_2} = M_5^{(2)}$ on
M = mu and by (36) and (37), $2M_k^{(i)}\cdot[(T/u) - (k - 1)]$ is that value below which
$100p_i$ per cent of the chi square distribution of 2k degrees of freedom lies. For a
central 90 per cent interval $2M_4^{(1)}\cdot[(T/u) - 3] = 2.733$ and $2M_5^{(2)}\cdot[(T/u) - 4]
= 18.307$. From these values, 0.00273 < m < 0.01831. The asymptotic result
obtained from (29) using $x_{p_2} = -x_{p_1} = 1.645$ is 0.00275 < m < 0.01832 which
agrees very well with the precise result.

EXAMPLE 3. (Fixed count, high rate.) Suppose that m is expected to be about 
2500 particles per second. Three cases are considered: 

(i) $u = 4\cdot10^{-5}$ second, N = 75,000 and $t_N = 30.1$ seconds,

(ii) $u = 4\cdot10^{-5}$ second, N = 300,000 and $t_N = 120.4$ seconds, and

(iii) u = 4-10°°, N = 75,000 and ty = 30.1 seconds. Examination of (20) 
indicates that Theorem 3 must be used in all cases. As in Example 1, 95.46 per
cent central confidence intervals are given. Keeping the same order of accuracy
as in Example 1, one finds for cases (i) and (iii) that
$a_0/t_N = 2491.7(1.00001 \mp 0.00730)u$, $a_1/t_N = 1245.8u\theta$, and
$a_2/t_N = 415.3(\theta^2 \mp 0.01460\theta)u$. For case
(ii), replace the quantities enclosed in parentheses in $a_0/t_N$ and $a_2/t_N$ above by
$(1 \mp 0.00365)$ and $(\theta^2 \mp 0.0073\theta)$ respectively. Applying these in (47) and
dropping terms in the bracket there that vanish to five decimal places, the
results are:


(i) $2473.5 + 12.2\theta + 0.5\theta^2 < m < 2509.9 + 12.6\theta + 0.5\theta^2$
(ii) $2482.6 + 12.4\theta + 0.5\theta^2 < m < 2500.8 + 12.5\theta + 0.5\theta^2$
(iii) $2473.5 < m < 2509.9.$


The value of $\theta$ will certainly be unknown in most counting experiments. Indeed
$\theta$ may be a purely fictitious parameter that should only be used to tie together
the extreme cases of the Type I and Type II selection rules. Cases (i) and (ii)
show a dependence of the limits on $\theta$ that is substantial relative to the length of
the confidence intervals. Increasing N shortens the interval but has only a very
slight effect upon the $\theta$ dependence.


REFERENCES

[1] H. Cramér, Mathematical Methods in Statistics, Princeton University Press, 1946.

[2] W. Feller, "On probability problems in the theory of counters," Courant Anniversary
Volume, 1948.

[3] W. Feller, Probability Theory and Its Applications, John Wiley and Sons, 1950.

[4] F. Garwood, "Fiducial limits for the Poisson distribution," Biometrika, Vol. 28 (1936),
pp. 437-442.

[5] M. G. Kendall, The Advanced Theory of Statistics, Vol. 1, Charles Griffin and Co., 4th
edition, London, 1948.

[6] M. G. Kendall, The Advanced Theory of Statistics, Vol. 2, Charles Griffin and Co.,
London, 1948.

[7] S. Malmquist, "A statistical problem connected with the counting of radioactive
particles," Ann. Math. Stat., Vol. 18 (1947), pp. 255-265.

[8] M. S. Raff, "The distribution of blocks in an uncongested stream of automobile traffic,"
J. Amer. Stat. Assoc., Vol. 46 (1951), pp. 114-123.





THE POWER OF RANK TESTS
By E. L. Lehmann


Stanford University and University of California, Berkeley 


1. Summary. Simple nonparametric classes of alternatives are defined for 
various nonparametric hypotheses. The power of a number of such tests against 
these alternatives is obtained and illustrated with some numerical results. Opti- 
mum rank tests against certain types of alternatives are derived, and optimum 
properties of Wilcoxon’s one- and two-sample tests and of the rank correlation 
test for independence are proved. 


2. Introduction. The most pressing need in the theory and practice of non- 
parametric tests at this time seems to be for results concerning the power of such 
tests, particularly those based on ranks. This would provide a basis for com- 
paring the many different tests proposed as well as for determining the sample 
sizes necessary to distinguish significant departures from a hypothesis with a 
reasonable degree of certainty. 

The chief problem one is faced with when investigating the power of a non- 
parametric test is the choice of suitable alternatives. Even in the simplest prob- 
lems the variety of alternatives is so great that it is clearly impossible to consider 
all of them. In the past, investigators have concentrated on alternatives postulat- 
ing normal distributions for the random variables in question. These alternatives, 
which unfortunately are rather difficult to handle mathematically, must, of 
course, be studied if one wishes to find out how nonparametric methods compare 
with procedures based on normal theory. On the other hand, when comparing 
different rank tests, one is no longer tied to normal alternatives, but it would on 
the contrary seem rather desirable to make the comparisons in terms of non- 
parametric classes of alternatives. 

As a specific example, consider the one-sided two-sample problem, and suppose
that on the basis of samples $X_1, \cdots, X_m$; $Y_1, \cdots, Y_n$ from cumulative
distribution functions F and G respectively we wish to test the hypothesis H: F = G
against the alternatives that G(x) < F(x) for all x. If among these alternatives
we look for some simple subclasses, parametric theory suggests


(2.1) G(x) = F(x - a) for some a > 0.


But under such alternatives, the distribution of the ranks will depend not only on
a, but also on F, nor, in general, would a be a suitable measure of the difference
of F and G. The situation is similar to the corresponding one for normal
distributions with different means and, say, common but unknown variance.

We shall in the present paper discuss mathematically “natural” nonparametric 
alternatives against which the distribution of the ranks is constant. Once these 
have been defined, it is relatively simple, on the basis of a theorem of Hoeffding, 
to obtain the power of any rank test and also to derive tests possessing various 
optimum properties. 

The classes of alternatives with which we shall be dealing involve arbitrary 
functions for which one must make a definite choice in order to get specific 
power-results. This choice is here made solely on the grounds of simplicity for 
the resulting calculations. We do not, of course, claim that these are the al- 
ternatives that actually prevail when the hypothesis is not true. Rather, it seems 
that where nonparametric methods are appropriate, one usually does not have 
very precise knowledge of the alternatives. What is then required are alternatives 
representative of the principal types of deviation from the hypothesis, in terms 
of which one can study, at least in outline, the ability of various tests to detect 
such deviations. Such an approach is here presented, and the computations are 
carried through for a few examples. However, in order to get a valid comparison 
of such tests as the Wald-Wolfowitz run test and the Smirnov two-sample test, 
for example, much more systematic computation is required. Such computations 
seem entirely feasible and would seem to be a worthwhile undertaking. 

I should like to express my gratitude to Miss E. L. Scott for her help in setting 
up and supervising the computations for Table 1 and to Mrs. M. Vasilewskis 


who carried out these computations, as well as to Mr. H. Wagner and Mr. J. 
Rosenbaum on whose computations Fig. 2 and 3 are based. 


3. The hypothesis of randomness. While we shall be concerned mainly with 
the two-sample problem, it is convenient to present some preliminary considera- 
tions in the more general notation of the hypothesis of randomness. We shall 
here make the assumption, to hold throughout the paper, that all distribution 
functions that we consider are to be continuous. 

Let $f_i$ ($i = 1, \cdots, N$) be continuous, nondecreasing functions defined over
the interval [0, 1] such that $f_i(0) = 0$, $f_i(1) = 1$. Let $Z_1, Z_2, \cdots, Z_N$ be
independent random variables distributed according to cumulative distribution
functions $F_1, \cdots, F_N$. We shall denote by $\mathcal{F}(f_1, \cdots, f_N)$ the family of all
$(F_1, \cdots, F_N)$ such that $F_i = f_i(F)$ where F runs through all continuous cdf's.
The classes $\mathcal{F}(f_1, \cdots, f_N)$ for different choices of the functions $f_1, \cdots, f_N$ then
define a partition of the family of all N-tuples $(F_1, \cdots, F_N)$ of the kind
described. It should perhaps be pointed out that different N-tuples $f_i$ do not
necessarily generate different families of F's. If $f_1$ is strictly increasing on [0, 1],
a natural normalization would be to take $f_1(x) = x$, $0 \le x \le 1$. If $(F_1, \cdots, F_N)$
belongs to the class $\mathcal{F}(f_1, \cdots, f_N)$ we shall write


(3.1) $F_1 : F_2 : \cdots : F_N = f_1 : f_2 : \cdots : f_N.$

We shall now show that the distribution of the ranks of the Z's is constant
within each family $\mathcal{F}(f_1, \cdots, f_N)$.

LEMMA 3.1. If F is a continuous cdf and if the cdf of Z is given by $P(Z \le z) =
f(F(z))$ where f is nondecreasing on [0, 1] with f(0) = 0, f(1) = 1, then the cdf
of F(Z) is f.

PROOF. When f(u) = u, $0 \le u \le 1$, this result is well known and implies in
our case that f(F(Z)) is uniformly distributed over [0, 1]. Therefore


$P(f(F(Z)) < f(u)) \le P(F(Z) \le u) \le P(f(F(Z)) \le f(u))$


and the first and third member equal f(u).

Let us denote the ranks of the N variables $Z_1, \cdots, Z_N$ by $T_1, \cdots, T_N$.
Then we have

LEMMA 3.2. If $Z_1, \cdots, Z_N$ are independent, the distribution of $T_1, \cdots, T_N$
is constant within each family $\mathcal{F}(f_1, \cdots, f_N)$.

PROOF. Clearly


$P(F(Z_{i_1}) < \cdots < F(Z_{i_N})) \le P(T_{i_1} = 1, \cdots, T_{i_N} = N) \le P(F(Z_{i_1}) \le \cdots \le F(Z_{i_N})).$


But the first and third members of this inequality are independent of F and
equal since by Lemma 3.1 the distribution of the $F(Z_i)$ is independent of F and
continuous.
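
The key step in this proof is that ranks are unchanged when every variable is replaced by its image under F, so the choice of F drops out. The following sketch illustrates this numerically for functions of the form $f_i(u) = u^{k_i}$: the same uniform draws are pushed through two different choices of F (standard normal and unit exponential), and the resulting rank vectors coincide. The function names, the exponents and the two choices of F are illustrative assumptions, not part of the paper.

```python
import math
import random
from statistics import NormalDist

def ranks(values):
    """Rank of each entry (1 = smallest)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def draw_z(uniforms, exponents, inverse_cdf):
    """Z_i = F^{-1}(U_i**(1/k_i)) has cdf f_i(F(z)) with f_i(u) = u**k_i."""
    return [inverse_cdf(u ** (1.0 / k)) for u, k in zip(uniforms, exponents)]

normal_inverse = NormalDist().inv_cdf

def exponential_inverse(w):
    """Inverse cdf of the unit exponential distribution."""
    return -math.log1p(-w)

rng = random.Random(0)
exponents = [1, 1, 2, 3, 6]
trials = 200
agreements = 0
for _ in range(trials):
    # keep draws strictly inside (0, 1) so both inverse cdfs are defined
    uniforms = [min(max(rng.random(), 1e-12), 1 - 1e-12) for _ in exponents]
    r_normal = ranks(draw_z(uniforms, exponents, normal_inverse))
    r_expo = ranks(draw_z(uniforms, exponents, exponential_inverse))
    agreements += (r_normal == r_expo)
print(agreements, "of", trials, "rank vectors agree")
```

Because the rank vector is the same function of the uniform draws for either F, its distribution depends only on the exponents, which is the content of the lemma.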

As an immediate consequence of this lemma we have 

THEOREM 3.1. Given any functions $f_1, \cdots, f_N$ and any rank test of the
hypothesis H: $(F_1, \cdots, F_N) \in \mathcal{F}(f_1, \cdots, f_N)$, the power of this test depends only on
$F_1 : \cdots : F_N$. That is, if $F_1 : \cdots : F_N = F_1' : \cdots : F_N'$ so that $(F_1, \cdots, F_N)$ and
$(F_1', \cdots, F_N')$ belong to the same class $\mathcal{F}(f_1', \cdots, f_N')$ the test has the same power
against these two alternatives. Furthermore, given any class of alternatives
K: $(F_1, \cdots, F_N) \in \mathcal{F}(f_1', \cdots, f_N')$ there exists a uniformly most powerful rank
test for testing H against K.

PROOF. The first statement is just a specialization of Lemma 3.2. Since the
distribution of the ranks is simple both under H and K, the second statement
as well as a method of constructing the most powerful rank test follow from the
Neyman-Pearson fundamental Lemma.

In order to apply this theorem we require the distribution of $(T_1, \cdots, T_N)$
for the $(f_1, \cdots, f_N)$ of our choice. The relevant result was obtained by Hoeffding
([1], p. 88). Instead of stating it here we shall in the next section give its
specialization for the two-sample problem.


4. The two-sample problem. Let $X_1, \cdots, X_m$ and $Y_1, \cdots, Y_n$ be independently
distributed with cdf's F and G respectively. We wish to test the hypothesis
H: F = G. The classes $\mathcal{F}(f_1, \cdots, f_N)$, in the present case, involve only two
functions f and g and may be written as $\mathcal{F}(f, g)$. To simplify our notation, we shall
assume that f is strictly increasing. Then $\mathcal{F}(f, g)$ may be represented by a single
function g and is given by $\mathcal{F}(g) = \{(F, g(F))\}$ where the domain of F is as before
the totality of continuous cdf's, and where g is a continuous nondecreasing
function with g(0) = 0, g(1) = 1.

Let us denote the ordered X's and Y's by $X^{(1)} < X^{(2)} < \cdots < X^{(m)}$ and
$Y^{(1)} < \cdots < Y^{(n)}$ and the ranks of the X's and Y's in the combined sample
by $R_1 < \cdots < R_m$ and $S_1 < \cdots < S_n$ respectively. The complete set of the
ranks is, of course, determined by the ranks of the Y's alone. We shall assume
here that the function g is differentiable on [0, 1] with derivative g'. Specializing
a theorem of Hoeffding ([1], p. 88) to the present case we find that when F and
G = g(F) are the cdf's of the X's and Y's, then


$P(S_1 = s_1, \cdots, S_n = s_n) = \binom{m+n}{m}^{-1} E_F[g'(F(Y^{(s_1)})) \cdots g'(F(Y^{(s_n)}))]$


where the expectation is computed under the assumption that F is the true
distribution of both the X's and Y's. Since in this case F(Y) is uniformly
distributed over [0, 1], we get


(4.1) $P(S_1 = s_1, \cdots, S_n = s_n) = \binom{m+n}{m}^{-1} E[g'(U^{(s_1)}) \cdots g'(U^{(s_n)})]$


where $U^{(s_1)}, \cdots, U^{(s_n)}$ are the $s_1$th, $\cdots$, $s_n$th order statistics in a sample of m + n
variables distributed uniformly over [0, 1].
Since the probability distribution of the ranks can be expressed so simply in
terms of g it is seen that the difficulty in obtaining power results for a specific
alternative is directly related to the complexity of the function g involved. This
explains why the investigation for normal alternatives has proved so difficult.
When F and G are two distinct normal cdf's, the function $g = G(F^{-1})$ is not
particularly easy to handle.

Consider now the one-sided alternatives G(x) < F(x). To this corresponds
a function g such that $g(x) \le x$, $0 \le x \le 1$. The simplest choice in view of
(4.1) seems to be $g(x) = x^k$; k > 1. The associated problem is that of testing
H: G = F against the alternatives K: $G = F^k$. In addition to mathematical
simplicity, this choice has the advantage of admitting a simple interpretation of
the alternatives. Suppose that k is an integer. Then $F^k$ is the distribution of the
maximum of k independent variables having distribution F. Thus under the
alternative, the X's have distribution F while the distribution of the Y's is the
same as that of the maximum of k X's.


FIG. 1-B


In order to give an idea of how much larger the Y's are than the X's, note that
if $G = F^k$, $P(X < Y) = \int F\,dG = k/(k + 1)$. In Fig. 1, we have assumed that
the distribution of X is given by the densities


$f_1(x) = \frac{1}{\sqrt{2\pi}}e^{-x^2/2}; \quad f_1(x) = e^{-x},\ 0 \le x; \quad f_1(x) = 1,\ 0 \le x \le 1$


respectively and show the density $f_2$ of Y when $G = F^k$ for k = 2, 3 and 6.
In terms of the present frame of reference the distance of the density $f_2$ from $f_1$ is
the same in all three cases, since in each of them $f_2$ is the density of the maximum
of k observations from $f_1$ and since further in all three cases, every rank test has
the same probability of detecting the hypothesis to be false when $f_1$ is the density
of the X's and $f_2$ that of the Y's.

It is clear that a similar interpretation of the alternatives $F^k$ can be given when
k instead of being an integer is any rational number. Altogether, we may think
of the class of alternatives $G = F^k$ as a one parameter family of nonparametric
classes of alternatives. The distribution of the ranks under these alternatives is
now easily determined from (4.1).


FIG. 1-C


For if we put N = m + n, $s_0 = 0$, $s_{n+1} = N + 1$, $u_0 = 0$, and $u_{n+1} = 1$, the
joint density $p(u_1, \cdots, u_n)$ of $U^{(s_1)}, \cdots, U^{(s_n)}$ is given by

(4.2) $\frac{N!}{\prod_{j=0}^{n}(s_{j+1} - s_j - 1)!}\prod_{j=0}^{n}(u_{j+1} - u_j)^{s_{j+1}-s_j-1}$

over the region $0 = u_0 \le u_1 \le \cdots \le u_{n+1} = 1$. If here we make a transformation
to new variables $V_1, \cdots, V_n$ defined by


(4.3) $U^{(s_i)} = V_iV_{i+1} \cdots V_n \quad (i = 1, \cdots, n)$

and put $v_0 = 0$, $v_{n+1} = 1$, it is seen that the joint density of the V's is


(4.4) $\frac{N!}{\prod_{j=0}^{n}(s_{j+1} - s_j - 1)!}\prod_{j=1}^{n} v_j^{s_j-1}(1 - v_j)^{s_{j+1}-s_j-1}$

over the region $0 \le v_i \le 1$, $i = 1, \cdots, n$, so that the V's are independently
distributed according to Beta-distributions, that is, as are single order statistics
from a uniform distribution.
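
One way to see that the joint density (4.4) factors into independent Beta densities is to check that its constant equals the product of the Beta normalizing constants, each V with index j having parameters $s_j$ and $s_{j+1} - s_j$. The following is a small exact check with hypothetical function names; the rank set used in the usage lines is an arbitrary illustration.

```python
from fractions import Fraction
from math import factorial

def beta_function(a, b):
    """B(a, b) for positive integers a and b."""
    return Fraction(factorial(a - 1) * factorial(b - 1), factorial(a + b - 1))

def density_constant(s, N):
    """N! divided by the product over j = 0..n of (s_{j+1} - s_j - 1)!."""
    s_ext = [0] + list(s) + [N + 1]      # s_0 = 0 and s_{n+1} = N + 1
    denominator = 1
    for j in range(len(s) + 1):
        denominator *= factorial(s_ext[j + 1] - s_ext[j] - 1)
    return Fraction(factorial(N), denominator)

def product_of_beta_functions(s, N):
    """Product over j = 1..n of B(s_j, s_{j+1} - s_j)."""
    s_ext = list(s) + [N + 1]
    product = Fraction(1)
    for j in range(len(s)):
        product *= beta_function(s_ext[j], s_ext[j + 1] - s_ext[j])
    return product

s, N = (2, 5, 6), 8   # ranks of n = 3 Y's in a combined sample of N = 8
print(density_constant(s, N) * product_of_beta_functions(s, N))
```

The product printed is exactly one, so the density integrates to one as a product of independent Beta densities.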


TABLE 1

            m = n = 4               m = n = 6
Test    $\beta(F^2)$  $\beta(F^3)$    $\beta(F^2)$  $\beta(F^3)$
$T_1$      .23       .33         .29       .45
$T_2$      .31       .47         28        09
$T_3$      .32       .49         .41       .64
$T_4$      .14       .20         .17       .29
$T_5$      .15       .22         .21       .36
$T_6$      .19       ae          20        Ad
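Since $U^{(s_1)} \cdot U^{(s_2)} \cdots U^{(s_n)} = V_1 \cdot V_2^2 \cdots V_n^n$, we have when $G = F^k$ and hence
$g'(u) = ku^{k-1}$,

(4.5) $P(S_1 = s_1, \cdots, S_n = s_n) = \frac{k^n}{\binom{m+n}{m}} E[(U^{(s_1)} \cdots U^{(s_n)})^{k-1}] = \frac{k^n}{\binom{m+n}{m}} \prod_{j=1}^{n} E(V_j^{jk-j}) = \frac{k^n}{\binom{m+n}{m}} \prod_{j=1}^{n} \frac{\Gamma(s_j + jk - j)}{\Gamma(s_j)} \cdot \frac{\Gamma(s_{j+1})}{\Gamma(s_{j+1} + jk - j)}.$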
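
For integer k the Gamma ratios in (4.5) are ratios of factorials, so the rank probabilities can be evaluated exactly. The following is a minimal sketch; the function name is illustrative, and integer k of at least 1 is assumed so that exact rational arithmetic applies.

```python
from fractions import Fraction
from itertools import combinations
from math import comb, factorial

def rank_probability(s, m, k):
    """(4.5): P(S_1 = s_1, ..., S_n = s_n) when G = F^k, for integer k >= 1."""
    n = len(s)
    N = m + n
    s_ext = list(s) + [N + 1]                 # s_{n+1} = N + 1
    value = Fraction(k ** n, comb(N, m))
    for j in range(1, n + 1):
        power = j * (k - 1)                   # exponent jk - j of V_j
        s_j, s_next = s_ext[j - 1], s_ext[j]
        # Gamma(s_j + jk - j) / Gamma(s_j)
        value *= Fraction(factorial(s_j + power - 1), factorial(s_j - 1))
        # Gamma(s_{j+1}) / Gamma(s_{j+1} + jk - j)
        value *= Fraction(factorial(s_next - 1), factorial(s_next + power - 1))
    return value

m, n, k = 3, 2, 3
total = sum(rank_probability(s, m, k) for s in combinations(range(1, m + n + 1), n))
print(total)                        # the probabilities over all rank sets add to one
print(rank_probability((2,), 1, k)) # m = n = 1: P(X < Y) = k/(k + 1)
```

With k = 1 the function returns the null probability, the reciprocal of the binomial coefficient, for every rank set.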
In particular, when k = 2 so that $G = F^2$,


(4.6) $P(S_1 = s_1, \cdots, S_n = s_n) = \frac{2^n}{\binom{m+n}{m}} \cdot \frac{s_1(s_2 + 1) \cdots (s_n + n - 1)}{(m + n + 1)(m + n + 2) \cdots (m + 2n)}.$


Using (4.6) (or more generally (4.5) or (4.1)) one can now compute the power
of various rank tests against the alternatives in question. One must list the
sets $(s_1, \dots, s_n)$ making up the critical region and then sum the right-hand
side of (4.6) over these values of the ranks. In this manner Table 1 was computed,
which gives the power of six different rank tests $T_1, \dots, T_6$ against the
alternatives $G = F^2$ and $G = F^3$ at level of significance $\alpha = .1$. Since the
computation of the exact power rapidly increases in difficulty with the sample size,
these computations have been carried through only for the cases $m = n = 4$
and $m = n = 6$.
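The enumeration just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names are hypothetical, the critical region is taken to be the `size` rank sets with the largest value of the statistic, and no randomization is performed, so the sketch applies only when the level is attained exactly (for $m = n = 4$, 7 of the 70 rank sets give $\alpha = .1$).

```python
from itertools import combinations
from math import comb, exp, lgamma


def rank_prob(s, m, k):
    """P(S = s) under G = F**k, from (4.5); s must be increasing."""
    n = len(s)
    ext = list(s) + [m + n + 1]
    log_prod = 0.0
    for j in range(1, n + 1):
        a = j * (k - 1)
        log_prod += lgamma(ext[j - 1] + a) - lgamma(ext[j - 1])
        log_prod += lgamma(ext[j]) - lgamma(ext[j] + a)
    return k ** n / comb(m + n, m) * exp(log_prod)


def power(statistic, m, n, k, size):
    """Power against G = F**k of the test rejecting on the `size` largest values."""
    sets = sorted(combinations(range(1, m + n + 1), n), key=statistic, reverse=True)
    # Refuse to split a tie at the boundary, since no randomization is done here.
    assert statistic(sets[size - 1]) > statistic(sets[size])
    return sum(rank_prob(s, m, k) for s in sets[:size])


def wilcoxon(s):
    """One-sided Wilcoxon statistic: the sum of the Y ranks."""
    return sum(s)


def most_powerful_f2(s):
    """Statistic s_1 (s_2 + 1) ... (s_n + n - 1) of the test optimal against F^2."""
    prod = 1
    for i, s_i in enumerate(s):
        prod *= s_i + i
    return prod


if __name__ == "__main__":
    for name, stat in (("Wilcoxon", wilcoxon), ("optimal for F^2", most_powerful_f2)):
        print(name, [round(power(stat, 4, 4, k, 7), 2) for k in (2, 3)])
    # Wilcoxon [0.31, 0.47]
    # optimal for F^2 [0.32, 0.49]
```

For $m = n = 4$ the second line of output agrees, after rounding, with the entries .32 and .49 given for the most powerful test in Table 1.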

The above tests are defined as follows. 


$T_1$: One-sided median test. Rejects H when too many Y's exceed the median
of the combined sample. See Mood ([2], p. 394) and Westenberg [18].

$T_2$: One-sided Wilcoxon test [3], [4]. Rejects when $S_1 + \cdots + S_n$ is too large,
or equivalently when there are too many pairs $X_i$, $Y_j$ ($i = 1, \dots, m$;
$j = 1, \dots, n$) with $X_i < Y_j$.

$T_3$: This is the most powerful rank test for testing $G = F$ against $G = F^2$
(see Section 6). It rejects when $S_1 (S_2 + 1) \cdots (S_n + n - 1)$ is too
large.

$T_4$: Wald-Wolfowitz run test [5]. Rejects when the total number of runs of
X's and Y's is too small.

$T_5$: Two-sided median test ([2], p. 394). Rejects when either too many X's or
too many Y's exceed the median of the combined sample.

$T_6$: Two-sided Wilcoxon test [3], [4]. Rejects when

$$\left| \sum_{i=1}^{n} S_i - \tfrac{1}{2} n (m + n + 1) \right|$$

is too large.

Although it is not shown in Table 1, we mention for later reference also

$T_7$: Smirnov two-sample test [6]. Rejects when

$$\sup_t \left| F_{X_1, \dots, X_m}(t) - G_{Y_1, \dots, Y_n}(t) \right|$$

is too large where $F_{X_1, \dots, X_m}(t)$ and $G_{Y_1, \dots, Y_n}(t)$ are the sample cumulative
distribution functions of the X's and Y's respectively.


Of course, not all of these tests are directly comparable. While the first three 
are aimed only at alternatives under which the Y’s tend to be larger than the 
X’s, the fifth and sixth test are designed against two-sided alternatives, and the 
fourth and seventh against arbitrary deviations from the hypothesis. 


5. Large sample power. For alternatives of the type $G = F^k$ it also becomes a
relatively easy task to compute the approximate power of certain rank tests 
using large sample theory. Some results obtained in this way are shown in Fig. 
2 and 3. 

Fig. 2 gives the power of various tests against the alternative $g(F) = F^2$ for
different sample sizes n. The lowest of the four curves (labeled $\beta_4$) corresponds to
the run test (the subscripts refer to the numbering of the tests in the previous
section) and is based on theory not yet completely verified. The next curve,
$\beta_7$, gives a lower bound to the power of the Smirnov test, while the two upper
curves show the power of the two-sided and one-sided Wilcoxon test. In Fig.
3 are shown the corresponding curves for $g(F) = F^3$, except that the run test has
been omitted.

The remainder of this section is devoted to a discussion of the formulae from
which Fig. 2 and 3 were computed.

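For the Wilcoxon statistic U which counts the number of pairs $X_i$, $Y_j$ with
$X_i < Y_j$ it was proved by Mann and Whitney [4] that for large samples

$$\frac{\dfrac{U}{mn} - E\left(\dfrac{U}{mn}\right)}{\sigma\left(\dfrac{U}{mn}\right)}$$

is approximately normally distributed when $F = G$, and the proof of the
corresponding fact when $F \ne G$ was given in [7]. Mann and Whitney also gave
the first two moments of U as

$$\begin{aligned}
E\left(\frac{U}{mn}\right) &= \int F \, dG, \\
mn \, \sigma^2\left(\frac{U}{mn}\right) &= \frac{m + n + 1}{12} + (m - 1)(\gamma - \epsilon_1) + (n - 1)(\gamma - \epsilon_2) - (m + n - 1)\gamma^2,
\end{aligned} \tag{5.1}$$

where

$$\gamma = \tfrac{1}{2} - \int F \, dG, \qquad \epsilon_1 = \tfrac{1}{3} - \int F^2 \, dG, \qquad \epsilon_2 = \tfrac{1}{3} - \int (1 - G)^2 \, dF.$$

(Note that the notation used here differs from that in [4].)
If $G = F^k$ we have

$$\int F \, dG = \frac{k}{k + 1}, \qquad \int F^2 \, dG = \frac{k}{k + 2}, \qquad \int (1 - G)^2 \, dF = \frac{2k^2}{(k + 1)(2k + 1)},$$

and hence on substituting in (5.1)

$$E\left(\frac{U}{mn}\right) = \frac{k}{k + 1}, \qquad mn \, \sigma^2\left(\frac{U}{mn}\right) = \frac{k}{(k + 1)^2} \left[ 1 + \frac{m - 1}{k + 2} + \frac{k(n - 1)}{2k + 1} \right]. \tag{5.2}$$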
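The way the moments of $U/mn$ under $G = F^k$ lead to an approximate power can be sketched as follows. This is a minimal illustration, not the computation behind Fig. 2 and 3: the function names are hypothetical, the moments are taken in the closed form $E(U/mn) = k/(k+1)$ and $mn\,\sigma^2(U/mn) = \frac{k}{(k+1)^2}[1 + \frac{m-1}{k+2} + \frac{k(n-1)}{2k+1}]$, and the one-sided test is assumed to reject when $U/mn$ exceeds its null mean by the appropriate normal multiple of its null standard deviation.

```python
from math import sqrt
from statistics import NormalDist


def wilcoxon_moments(m, n, k):
    """Mean and variance of U/(mn) when G = F**k."""
    mean = k / (k + 1)
    bracket = 1 + (m - 1) / (k + 2) + k * (n - 1) / (2 * k + 1)
    var = k / (k + 1) ** 2 * bracket / (m * n)
    return mean, var


def approx_power_one_sided(m, n, k, alpha=0.1):
    """Normal approximation to the power of the one-sided Wilcoxon test."""
    std_normal = NormalDist()
    mean0, var0 = wilcoxon_moments(m, n, 1)          # k = 1 is the hypothesis G = F
    cutoff = mean0 + std_normal.inv_cdf(1 - alpha) * sqrt(var0)
    mean1, var1 = wilcoxon_moments(m, n, k)
    return 1 - std_normal.cdf((cutoff - mean1) / sqrt(var1))


if __name__ == "__main__":
    for size in (10, 20, 40):
        print(size, round(approx_power_one_sided(size, size, 2), 3))
```

Setting $k = 1$ recovers the null variance $(m + n + 1)/(12mn)$ and an approximate power equal to $\alpha$, which is a convenient check on the formula.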


The theory of the run test was developed by Wald and Wolfowitz
[5] and certain extensions were given in [8]. If W denotes the total number
of runs of X's and Y's and if $m/n = \gamma$ it was shown in [5] that when $F = G$,
$(W/m - E(W/m))/\sigma(W/m)$ is asymptotically normally distributed. It was
also proved that when $G = g(F)$ where the derivative $g'$ of g is continuous and
positive on $0 < x < 1$, then

$$E\left(\frac{W}{m}\right) \to 2 \int_0^1 \frac{g'(x)}{\gamma + g'(x)} \, dx \qquad \text{as } m \to \infty. \tag{5.3}$$

In [8] Wolfowitz stated that the distribution of W is approximately normal
even when $g(x) \ne x$, and he derived the asymptotic formula (5.4):


»( W . [ g'(v* + 9”) 
15° onl ———1.— Se eee me 
“ (7) I (y + Q’)° mm on (y+g')* - 


1 g” 2 . 1 q’ | 
ifety «| Th G+9P"] 

When $G = F^k$, then $g(x) = x^k$ and $g'(x) = k x^{k-1}$, and the integrals on the
right-hand side of (5.3) and (5.4) can be evaluated without much difficulty.
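For instance, the limit in (5.3), namely $2\int_0^1 g'(x)/(\gamma + g'(x))\,dx$ with $\gamma = m/n$, can be evaluated numerically. The sketch below uses a composite Simpson rule; the function names and the number of subintervals are illustrative choices, not part of the paper.

```python
from math import log


def run_limit(g_prime, gamma, steps=2000):
    """Simpson approximation of 2 * integral_0^1 g'(x) / (gamma + g'(x)) dx.

    `steps` must be even.
    """
    h = 1.0 / steps

    def integrand(x):
        gp = g_prime(x)
        return gp / (gamma + gp)

    total = integrand(0.0) + integrand(1.0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return 2 * total * h / 3


def run_limit_power(k, gamma):
    """The same limit for G = F**k, where g'(x) = k * x**(k - 1)."""
    return run_limit(lambda x: k * x ** (k - 1), gamma)


def closed_form_k2_equal_samples():
    """Analytic value 2 - log 3 for k = 2 and gamma = 1."""
    return 2 - log(3)


if __name__ == "__main__":
    print(run_limit_power(2, 1.0), closed_form_k2_equal_samples())
```

Under the hypothesis ($k = 1$) the limit is $2/(1 + \gamma)$, that is, about $2mn/(m + n)$ runs in all; for $k = 2$ and $m = n$ it drops to $2 - \log 3$.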


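The power of the run test against $g(F) = F^2$ shown in Fig. 2 was computed in
this manner. Since then it has been pointed out to me by R. Savage that when
in the limit result for

$$\frac{\dfrac{W}{m} - E\left(\dfrac{W}{m}\right)}{\sigma\left(\dfrac{W}{m}\right)}$$

we replace $E(W/m)$ by

$$2 \int_0^1 \frac{g'(x)}{\gamma + g'(x)} \, dx,$$

the error is of the order

$$\sqrt{m} \, \left| E\left(\frac{W}{m}\right) - 2 \int_0^1 \frac{g'(x)}{\gamma + g'(x)} \, dx \right|$$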


as is seen from (5.4). Thus (5.3) is not enough to guarantee the validity of this 
substitution. However, the numerical results obtained seemed sufficiently inter- 
esting to leave them in, in the hope that a proof of their validity will soon be 
forthcoming. 

The large sample distribution of the Smirnov statistic has not yet been
investigated when $F \ne G$. However, it was pointed out by Massey [9] that a lower
bound to the power can be obtained simply by the inequality

$$P\left( \sup_t \left| F_{X_1, \dots, X_m}(t) - G_{Y_1, \dots, Y_n}(t) \right| > C \right) \ge P\left( \left| F_{X_1, \dots, X_m}(t_0) - G_{Y_1, \dots, Y_n}(t_0) \right| > C \right) \tag{5.5}$$

where $t_0$ is any particular value of t. If $F(x) = x$ ($0 \le x \le 1$), $G(x) = F^k(x) = x^k$
and we take for $t_0$ the point of maximum difference between F and G, we get
$t_0 = 1/\sqrt[k-1]{k}$. Now $F_{X_1, \dots, X_m}(t_0)$ and $G_{Y_1, \dots, Y_n}(t_0)$ are the proportion of
successes in m and n binomial trials with probability of success equal to $F(t_0) = t_0$
and $G(t_0) = t_0^k$ respectively. Thus for moderately large sample sizes

$$F_{X_1, \dots, X_m}(t_0) - G_{Y_1, \dots, Y_n}(t_0)$$

is approximately normally distributed with mean $t_0 - t_0^k$ and variance
$(t_0(1 - t_0)/m) + (t_0^k(1 - t_0^k)/n)$.
The constant C of (5.5) can be obtained from [10] or [11].
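The lower bound can be put in computational form as follows. The sketch is illustrative only: the function name is hypothetical, the critical constant `c` must be supplied by the user (for example from the tables cited in the text), and the normal approximation to the difference of the two binomial proportions is used throughout. It assumes $k > 1$.

```python
from math import sqrt
from statistics import NormalDist


def smirnov_lower_bound(m, n, k, c):
    """Approximate lower bound to the power of the Smirnov test against G = F**k.

    Returns the point t0 of maximum difference between F(x) = x and G(x) = x**k,
    together with the normal approximation to P(|F_m(t0) - G_n(t0)| > c).
    """
    t0 = k ** (-1.0 / (k - 1))           # solves 1 - k * t**(k - 1) = 0
    p_f, p_g = t0, t0 ** k               # success probabilities F(t0) and G(t0)
    mean = p_f - p_g
    sd = sqrt(p_f * (1 - p_f) / m + p_g * (1 - p_g) / n)
    diff = NormalDist(mean, sd)
    return t0, (1 - diff.cdf(c)) + diff.cdf(-c)


if __name__ == "__main__":
    print(smirnov_lower_bound(20, 20, 2, 0.35))
```

For $k = 2$ the maximum difference occurs at $t_0 = 1/2$, where $F - G = 1/4$.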


6. Optimum rank tests for the two-sample problem. We next consider the
problem of determining the optimum rank test of $H: G = F$ against $K: G = g(F)$.
Under the hypothesis all $\binom{m+n}{m}$ possible combinations of $s_1, \dots, s_n$ are
equally likely while their probabilities under K are given by (4.1). Thus the
problem, in terms of ranks, reduces to that of testing a simple hypothesis against
a simple alternative, and its solution is given by the fundamental lemma of
Neyman and Pearson. The most powerful test rejects when the ratio of the
probabilities is too large. Since the denominator of this ratio is constant
(independent of $s_1, \dots, s_n$), this is equivalent to rejecting when (4.1) is too large.

If we take, for example, $g(F) = F^2$ the most powerful rank test rejects when

$$s_1 (s_2 + 1) \cdots (s_n + n - 1) > C.$$

The power of this test is shown against $G = F^2$ and $G = F^3$ in Table 1.

Since usually one does not have any precise alternatives in mind, it is perhaps 
more interesting to turn the problem around and to investigate what optimum 
properties (if any) are possessed by some of the standard rank tests. This gives 
an indication of the type of deviation from the hypothesis that the test under 
consideration is particularly suited to detect and therefore of the circumstances 
in which the application of the test is appropriate. As one such example, we shall 
discuss here the Wilcoxon test. 

Consider to this end the one-parameter family of nonparametric alternatives
given by

$$g_p(F) = qF + pF^2, \qquad p + q = 1. \tag{6.1}$$

If $\beta(p)$ denotes the power of a test against $g_p(F)$ we shall show that the one-sided
Wilcoxon test among all rank tests maximizes $\beta'(0)$, the slope of the power
function at the hypothetical point. It is thus "locally most powerful" just against
the type of alternative we have been considering.

To prove this result we must consider $\beta(p)$. Since $d/du\,(g_p(u)) = q + 2pu$
it follows from (4.1) that

$$P(S_1 = s_1, \dots, S_n = s_n \mid p) = \frac{1}{\binom{m+n}{m}} E\left[ (q + 2pU^{(s_1)}) \cdots (q + 2pU^{(s_n)}) \right]$$

and hence that

$$\frac{\partial}{\partial p} P(S_1 = s_1, \dots, S_n = s_n \mid p) \Big|_{p=0} = \frac{1}{\binom{m+n}{m}} E\left[ \sum_{i=1}^{n} (2U^{(s_i)} - 1) \right] = \frac{1}{\binom{m+n}{m}} \left[ \frac{2}{m + n + 1} \sum_{i=1}^{n} s_i - n \right].$$

Therefore

$$\beta'(0) = \sum \frac{1}{\binom{m+n}{m}} \left[ \frac{2}{m + n + 1} \sum_{i=1}^{n} s_i - n \right] \tag{6.2}$$

where the summation extends over the sets $(s_1, \dots, s_n)$ that form the critical
region. It follows as before from the fundamental lemma of Neyman and Pearson
that we maximize $\beta'(0)$ at a fixed level of significance by rejecting H when the
right-hand side of (6.2), and hence $\sum_{i=1}^{n} s_i$, is too large. This is the desired
result.
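The slope formula can be checked in exact arithmetic for small samples. The sketch below is an illustration under stated assumptions: it expands the product $E[(q + 2pU^{(s_1)}) \cdots (q + 2pU^{(s_n)})]$ term by term and evaluates each product moment through the independent Beta variables $V_j$ of Section 4, with $V_j$ taken to have parameters $(s_j, s_{j+1} - s_j)$; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def beta_moment(a, b, c):
    """E[V**c] for V distributed as Beta(a, b), with c a non-negative integer."""
    out = Fraction(1)
    for t in range(c):
        out *= Fraction(a + t, a + b + t)
    return out


def rank_prob(s, m, p):
    """P(S = s | p) under g_p(F) = qF + pF^2, by expanding the product in (4.1)."""
    n = len(s)
    ext = list(s) + [m + n + 1]
    q = 1 - p
    total = Fraction(0)
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            # prod_{i in subset} U^(s_i) = prod_j V_j**c_j, c_j = #{i in subset: i <= j}
            term = Fraction(1)
            for j in range(n):
                c_j = sum(1 for i in subset if i <= j)
                term *= beta_moment(ext[j], ext[j + 1] - ext[j], c_j)
            total += q ** (n - size) * (2 * p) ** size * term
    return total / comb(m + n, m)


def slope_at_zero(s, m):
    """Derivative of P(S = s | p) at p = 0: [2 * sum(s) / (m + n + 1) - n] / C(m+n, m)."""
    n = len(s)
    return (Fraction(2 * sum(s), m + n + 1) - n) / comb(m + n, m)


if __name__ == "__main__":
    for s in combinations(range(1, 5), 2):        # m = n = 2
        print(s, rank_prob(s, 2, Fraction(1, 4)), slope_at_zero(s, 2))
```

Because the slope is an increasing linear function of $\sum s_i$, ordering the rank sets by slope is the same as ordering them by the Wilcoxon statistic.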

The above property of the Wilcoxon two-sample test can be generalized in
various directions, which we shall sketch only briefly. First, an analogous property
holds for any test whose region of rejection is of the form

$$h(s_1) + h(s_2) + \cdots + h(s_n) \ge C. \tag{6.3}$$

In any such case one can find a function $h^*$ for which the test given by (6.3)
maximizes the slope of the power function $\beta(p)$ against the alternatives $g_p(F) =
qF + ph^*(F)$ at $p = 0$. A particular example of (6.3) is Mood's median test
$T_1$, which rejects when the number of $s_i$ exceeding a given constant is too large.
However, in most cases, and this seems to include the one under consideration,
the function $h^*$ is too complicated to be very enlightening and to warrant the
tedious computations necessary to obtain it. The existence of $h^*$ follows from
the fact that the $s_i$ can only take on a finite number of values so that without
loss of generality, the function h in (6.3) may be taken to be a polynomial.
Furthermore, as in the case of the Wilcoxon statistic the test maximizing $\beta'(0)$
against the alternatives $qF + ph^*(F)$ is given by the rejection region

$$E\left[ h^{*\prime}(U^{(s_1)}) + \cdots + h^{*\prime}(U^{(s_n)}) \right] > C.$$

To complete the proof it is enough to show that there exists a polynomial
$h^*$ and constants $a > 0$, b such that $h^*(0) = 0$, $h^*(1) = 1$, $h^{*\prime}(u) \ge 0$ for
$0 \le u \le 1$ and $E[h^{*\prime}(U^{(s)})] = a[h(s) + b]$. Now from the fact (see (4.5)) that

$$E\left[ (U^{(s)})^k \right] = \frac{s(s + 1) \cdots (s + k - 1)}{(m + n + 1) \cdots (m + n + k)}$$

it is seen that there exists a polynomial P for which $E[P'(U^{(s)})] = h(s)$. Putting
$h^*(s) = a[P(s) + bs] + c$ we need to show only that given any polynomial P
there exist constants $a > 0$, b, c such that $aP(0) + c = 0$, $a[P(1) + b] + c = 1$
and $a[P'(s) + b] \ge 0$ for $0 \le s \le 1$, and this is easily verified.

Another extension of the above result concerns a problem different from but
closely related to the two-sample problem. (In this connection see Hemelrijk
[19].) Let $Z_1, \dots, Z_N$ be identically and independently distributed with cdf M.
The hypothesis to be tested is that M is symmetric about the origin, that is,
that for all z, $M(z) + M(-z) = 1$. If we assume M to be continuous, put
$1 - M(0) = p$ and denote by F and G the conditional distributions of Z given that
$Z > 0$ and of $-Z$ given that $Z < 0$, the hypothesis is equivalent to the two
statements $p = \tfrac{1}{2}$, $F = G$. Let m and n be the number of positive and negative
Z's respectively, and denote by $X_1, \dots, X_m$ and $Y_1, \dots, Y_n$ the positive
Z's and the absolute values of the negative Z's respectively, in their original
order of subscripts. Consider now the probability of any particular set of ranks
of the Y's under some alternative. Given m and n, this is independent of p and
is given by (4.1) when $G = g(F)$. In addition, n is a binomial variable with
probability $1 - p$ of success. Thus we get

$$P(\text{the number of } Z\text{'s} < 0 \text{ is } n \text{ and } S_1 = s_1, \dots, S_n = s_n) = p^{N-n} (1 - p)^{n} E\left[ g'(U^{(s_1)}) \cdots g'(U^{(s_n)}) \right]. \tag{6.4}$$

If in particular, one considers alternatives with $p = \tfrac{1}{2}$, the right-hand side of
(6.4) becomes $2^{-N} E[g'(U^{(s_1)}) \cdots g'(U^{(s_n)})]$, which formally differs from (4.1)
only by a multiplicative constant. Thus any optimum test of the two-sample
problem derived on the basis of (4.1) gives rise to a dual one for the hypothesis
of symmetry. As an example, the Wilcoxon two-sample test which rejects when
$s_1 + \cdots + s_n > C$, under translation in the above manner becomes a test of the
hypothesis of symmetry also proposed by Wilcoxon [3] and recently shown by
Tukey [12] to be equivalent to a test proposed independently by Walsh [13].
This test is now seen to maximize $\beta'(0)$ against the alternatives according to
which $M(0) = \tfrac{1}{2}$, the conditional distribution of Z given $Z > 0$ is F and that of
$-Z$ given $Z < 0$ is $qF + pF^2$.

As another application of this approach, let us once more consider the
two-sample problem, but this time with a two-sided class of alternatives. For
simplicity, we take $m = n$, and we assume that either the X's are distributed
according to F and the Y's according to $qF + pF^2$ or vice versa. Let $\beta(p)$ denote
the power of a rank test against the first of these alternatives and $\beta^*(p)$ that
against the second. We shall then maximize the average power $\tfrac{1}{2}[\beta(p) + \beta^*(p)]$
at $p = 0$. Since it turns out that $\beta'(0) + \beta^{*\prime}(0) = 0$ this is equivalent to
maximizing $\beta''(0) + \beta^{*\prime\prime}(0)$.

From (4.1) we see that the sum of the probabilities of $R_1 = r_1, \dots, R_n = r_n$,
$S_1 = s_1, \dots, S_n = s_n$ under the two alternatives is, apart from the constant
factor $1/\binom{2n}{n}$,

$$\begin{aligned}
& E\left[ (q + 2pV^{(r_1)}) \cdots (q + 2pV^{(r_n)}) \right] + E\left[ (q + 2pU^{(s_1)}) \cdots (q + 2pU^{(s_n)}) \right] \\
&\quad = 2 + pE\left[ \sum_{i=1}^{n} (2V^{(r_i)} - 1) + \sum_{i=1}^{n} (2U^{(s_i)} - 1) \right] \\
&\qquad + p^2 E\left[ \sum_{i<j} (2V^{(r_i)} - 1)(2V^{(r_j)} - 1) + \sum_{i<j} (2U^{(s_i)} - 1)(2U^{(s_j)} - 1) \right] + o(p^2).
\end{aligned}$$

Now

$$\sum_{i=1}^{n} E(V^{(r_i)}) + \sum_{i=1}^{n} E(U^{(s_i)}) = \sum_{i=1}^{n} \frac{r_i + s_i}{2n + 1} = n$$

so that the coefficient of p is zero. The coefficient of $p^2$ is, except for a constant,
given by

$$\begin{aligned}
& 4 \sum_{i<j} \left[ E(V^{(r_i)} V^{(r_j)}) + E(U^{(s_i)} U^{(s_j)}) \right] - 2 \sum_{i<j} E\left[ V^{(r_i)} + V^{(r_j)} + U^{(s_i)} + U^{(s_j)} \right] \\
&\quad = 4 \sum_{i<j} \frac{r_i (r_j + 1) + s_i (s_j + 1)}{(2n + 1)(2n + 2)} - 2 \sum_{i<j} \frac{r_i + r_j + s_i + s_j}{2n + 1}.
\end{aligned}$$

Using the fact that $\sum_{i<j} s_i = \sum_i (n - i) s_i$ and that $\sum (r_i + s_i)$ as well as
$\sum (r_i^2 + s_i^2)$ are constants, the coefficient of $p^2$ is, except for a constant,

$$\left[ \sum s_i \right]^2 + \left[ \sum r_i \right]^2 + 2 \left[ \sum (n - i) s_i + \sum (n - i) r_i \right]. \tag{6.5}$$

Thus we maximize the average power at $p = 0$ by maximizing (6.5) or equivalently

$$\left[ \sum s_i - \frac{n(2n + 1)}{2} \right]^2 + \left[ \sum r_i - \frac{n(2n + 1)}{2} \right]^2 + 2 \sum_{i=1}^{n} (n - i + 1)(r_i + s_i). \tag{6.6}$$

Rather surprisingly this is not the two-sided Wilcoxon test which is given just
by the first two terms of (6.6).
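The reduction of the coefficient of $p^2$ to a statistic in the ranks can be verified exactly for small n. In the sketch below, which is an illustration with hypothetical function names, `p2_coefficient` evaluates $4\sum_{i<j}[r_i(r_j+1) + s_i(s_j+1)]/((2n+1)(2n+2)) - 2\sum_{i<j}(r_i+r_j+s_i+s_j)/(2n+1)$ and `statistic_65` evaluates $[\sum s_i]^2 + [\sum r_i]^2 + 2\sum (n-i)(r_i + s_i)$; both rank vectors are assumed to be in increasing order.

```python
from fractions import Fraction
from itertools import combinations


def complement(s, n):
    """Ranks of the X's, given the ranks s of the Y's among 1, ..., 2n."""
    return tuple(k for k in range(1, 2 * n + 1) if k not in s)


def p2_coefficient(r, s):
    """Coefficient of p**2 (up to an additive constant) from the product moments."""
    n = len(r)
    total = Fraction(0)
    for i in range(n):
        for j in range(i + 1, n):
            total += Fraction(4 * (r[i] * (r[j] + 1) + s[i] * (s[j] + 1)),
                              (2 * n + 1) * (2 * n + 2))
            total -= Fraction(2 * (r[i] + r[j] + s[i] + s[j]), 2 * n + 1)
    return total


def statistic_65(r, s):
    """[sum s]^2 + [sum r]^2 + 2 * sum_i (n - i)(r_i + s_i), with i running from 1."""
    n = len(r)
    weighted = sum((n - i) * (r[i - 1] + s[i - 1]) for i in range(1, n + 1))
    return sum(s) ** 2 + sum(r) ** 2 + 2 * weighted


if __name__ == "__main__":
    n = 3
    for s in ((1, 2, 6), (1, 3, 5)):
        r = complement(s, n)
        print(s, sum(s), statistic_65(r, s))
    # (1, 2, 6) 9 253
    # (1, 3, 5) 9 251
```

The two rank sets printed have the same Wilcoxon sum but different values of the statistic, which illustrates why the resulting test is not the two-sided Wilcoxon test.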


This result can be given a slightly different form. Let us write the alternative
$qF + pF^2$ in the form

$$g_\theta(F) = \frac{1}{1 + \theta} (F + \theta F^2) \tag{6.7}$$

where we shall be interested only in values of $\theta$ close to zero, and where we may
then consider also negative values of $\theta$. If $\beta(\theta)$ denotes the power of some rank
test against $g_\theta$ we may in analogy to the type A tests of Neyman and Pearson
[14] maximize $\beta''(0)$ subject to $\beta(0) = \alpha$, $\beta'(0) = 0$. This will clearly again lead
to (6.6).

Such a parametric approach can be carried further. Consider, for example,
samples from ? populations Fy, +++ , F, and the hypothesis H:F; = --- = F,. 
‘natives of the form F; = (1/1 + 6,)(F + 6;F’) with 
= 0, we .ur example, maximize the average power over the sphere 
6 for small 6. This is analogous to a formulation given by Wald [15] 

for the normal case, and leads to an extension of (6.6). 


7. The hypothesis of independence. To illustrate the general approach of this
paper with another example, consider a sample $(X_1, Y_1), \dots, (X_n, Y_n)$ from
a bivariate distribution. The hypothesis to be tested is that X and Y are
independent. Nonparametric alternatives to this will be defined by means of a
function h of two variables such that h is a continuous cdf over the unit square
$0 \le x, y \le 1$ (so that $h(0, 0) = 0$, $h(1, 1) = 1$). A nonparametric class $\mathcal{H}(h)$
of bivariate distributions is then formed by the totality of distributions $h(F, G)$
where F, G are arbitrary continuous univariate cdf's. Suppose now that the
X's are ordered and that in this ordering the rank of $X_i$ is $R_i$ ($i = 1, \dots, n$).
Similarly, we shall denote by $S_i$ the rank of $Y_i$ among the Y's. We then have,
analogously to the corresponding result in Section 3,

THEOREM 7.1. For the distributions of a class $\mathcal{H}(h)$ the distribution of the R's
and S's is constant, that is, independent of F and G.

This follows from

LEMMA 7.1. If h, F and G are continuous and if $P(X \le x, Y \le y) =
h(F(x), G(y))$ then

$$P(F(X) \le u, G(Y) \le v) = h(u, v). \tag{7.1}$$

PROOF. Let x, y be such that $F(x) = u$, $G(y) = v$. Then

$$P(X < x, Y < y) \le P(F(X) \le u, G(Y) \le v) \le P(X \le x, Y \le y).$$

But

$$P(X < x, Y < y) = P(X \le x, Y \le y) = h(F(x), G(y)) = h(u, v).$$


Again we can write down the distribution of the R's and S's using Hoeffding's
theorem. In fact if $h'(u, v) = (\partial^2/\partial u \, \partial v) h(u, v)$ we have

$$P(R_1 = r_1, \dots, R_n = r_n; S_1 = s_1, \dots, S_n = s_n) = \frac{1}{(n!)^2} E\left[ h'(U^{(r_1)}, V^{(s_1)}) \cdots h'(U^{(r_n)}, V^{(s_n)}) \right] \tag{7.2}$$

where $U_1, \dots, U_n$; $V_1, \dots, V_n$ are two independent samples from a uniform
distribution on [0, 1] and $U^{(i)}$, $V^{(i)}$ are the associated order statistics. It should
be noted that in (7.2) it is not assumed that either the $r_i$ or the $s_i$ are arranged in
natural order. Alternatively we may take $r_i = i$ and define $s_i$ as the rank of the
Y that is associated with the ith smallest X. In this notation only the $S_i$ remain
random and instead of (7.2) we get

$$P(S_1 = s_1, \dots, S_n = s_n) = \frac{1}{n!} E\left[ h'(U^{(1)}, V^{(s_1)}) \cdots h'(U^{(n)}, V^{(s_n)}) \right]. \tag{7.3}$$

Perhaps the simplest choice for h would seem to be

$$h'(u, v) = u + v; \qquad h(u, v) = \tfrac{1}{2}(uv^2 + u^2 v). \tag{7.4}$$

This corresponds to the family of cdf's $H(x, y) = \tfrac{1}{2}[F(x)G^2(y) + F^2(x)G(y)]$ and
can be interpreted similarly to the alternatives discussed in previous sections.
The observations X, Y are drawn with probability $\tfrac{1}{2}$ each from two bivariate
populations with independent components. According to one of these X is an
observation from F while Y is the larger of two observations from G; according
to the other the situation is just reversed. A general family of alternatives could
be obtained in this way, given by

$$H(x, y) = \sum_{i,j} a_{ij} F^i(x) G^j(y), \qquad a_{ij} \ge 0, \quad \sum_{i,j} a_{ij} = 1.$$

The distribution of the ranks under such alternatives can be written down on the
basis of (4.5) and (7.3). However, even in the simplest cases such as (7.4) the
resulting expressions are quite complex except for very small values of n.

It seems of interest that one of the best known tests for independence, that
based on the rank correlation coefficient, possesses an optimum property similar
to the ones derived in Section 6 for the Wilcoxon and related tests. For let

$$h_p(u, v) = quv + pu^2 v^2. \tag{7.4}$$

Using (7.2) we see that the probability of $R_i = r_i$, $S_i = s_i$ ($i = 1, \dots, n$) is

$$\frac{1}{(n!)^2} E\left[ (q + 4pU^{(r_1)} V^{(s_1)}) \cdots (q + 4pU^{(r_n)} V^{(s_n)}) \right].$$

Differentiating this with respect to p and setting $p = 0$ we get

$$\frac{1}{(n!)^2} E\left[ \sum_{i=1}^{n} (-1 + 4U^{(r_i)} V^{(s_i)}) \right] = \frac{1}{(n!)^2} \left[ -n + \frac{4}{(n + 1)^2} \sum_{i=1}^{n} r_i s_i \right].$$

Thus the test that maximizes the slope of the power function against the
alternatives (7.4) at $p = 0$ rejects when $\sum r_i s_i$ is too large, and hence when the
rank correlation coefficient is too large. More generally, if $h_p(u, v) = quv +
pg_1(u)g_2(v)$, the test that maximizes the slope of the power function rejects when
$\sum_i E[g_1'(U^{(r_i)}) g_2'(V^{(s_i)})]$ is too large. In this manner we obtain a generalization of the
result connected with (6.3).
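The connection with the rank correlation coefficient can be illustrated numerically. The sketch below is only an illustration with hypothetical function names: it takes $r_i = i$, so that the slope at $p = 0$ of the probability of a permutation $(s_1, \dots, s_n)$ is assumed to be $\frac{1}{n!}[\frac{4}{(n+1)^2}\sum i\,s_i - n]$, and it compares the ordering of permutations by this slope with their ordering by Spearman's coefficient.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def slope(s):
    """Slope at p = 0 of P(S = s) under h_p(u, v) = q*u*v + p*u^2*v^2, with r_i = i."""
    n = len(s)
    cross = sum(i * s_i for i, s_i in enumerate(s, start=1))
    return (Fraction(4 * cross, (n + 1) ** 2) - n) / factorial(n)


def spearman(s):
    """Spearman rank correlation between (1, ..., n) and the permutation s."""
    n = len(s)
    d2 = sum((i - s_i) ** 2 for i, s_i in enumerate(s, start=1))
    return 1 - Fraction(6 * d2, n * (n * n - 1))


if __name__ == "__main__":
    for s in permutations(range(1, 4)):
        print(s, slope(s), spearman(s))
```

Since the probabilities add to one for every p, the slopes must add to zero over all $n!$ permutations, which gives a simple check on the formula.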


8. Invariance. The definitions of nonparametric classes of alternatives given 
in the earlier sections for various problems may appear somewhat arbitrary. 
The only apparent justification is their success in permitting the derivation of 
results concerning the power of nonparametric tests. Also, it is not clear at this 
point to what problems, in addition to those discussed here, the method is 
applicable and what would be the appropriate classes of alternatives for such 
additional problems. We shall now show that the notion of invariance provides 
an approach to a general class of problems, which in the special cases treated 
in Sections 4 and 7, yields the definitions given there. 

The general concept of invariance due to Hunt and Stein, and presented in
[16] is not sufficiently broad for our purpose, so that we shall first indicate a slight
extension of this notion. As usual, we are concerned with a sample space $\mathcal{X}$ on
which an additive class $\mathcal{A}$ has been defined as the class of measurable sets. Let
$\mathcal{F}$ be a class of possible distributions F of a random variable X over $\mathcal{A}$. We are
also given measurable transformations $\tau$ of $\mathcal{X}$ onto itself (see [17] Ch. VIII).
These transformations are such that when X is distributed according to $F \in \mathcal{F}$,
the distribution $\bar{\tau}F$ of $\tau X$ is again in $\mathcal{F}$. The definition of $\bar{\tau}F$ may be expressed by
the transformation equation

$$P_F(\tau X \in A) = P_{\bar{\tau}F}(X \in A) \quad \text{for all } A \in \mathcal{A}. \tag{8.1}$$

Since we are not assuming that the transformations $\tau$ or $\bar{\tau}$ are 1:1, they do
not necessarily have inverses and cannot therefore be assumed to form a group.
Instead, we shall assume that the classes $\mathcal{S}$ and $\bar{\mathcal{S}}$ of transformations $\tau$ and $\bar{\tau}$
include the identity transformation and are closed under multiplication so that
$\tau_1, \tau_2 \in \mathcal{S} \Rightarrow \tau_1 \tau_2 \in \mathcal{S}$; that is, $\mathcal{S}$ and $\bar{\mathcal{S}}$ are semi-groups with an identity.

We shall say that a function $\varphi$ defined on $\mathcal{X}$ is invariant with respect to $\mathcal{S}$
if

$$\varphi(\tau x) = \varphi(x) \quad \text{for all } \tau \in \mathcal{S}, \ x \in \mathcal{X}. \tag{8.2}$$

Furthermore, we shall define as maximal invariant, the partition of $\mathcal{X}$ generated
by the relation R:

$$x_1 R x_2 \tag{8.3}$$

when there exists $\tau \in \mathcal{S}$ such that $x_2 = \tau x_1$, that is, the partition generated by
the smallest equivalence relation $\sim$ closed with respect to (8.3). We note that
the existence of $\tau_1$, $\tau_2$ such that

$$\tau_1 x_1 = \tau_2 x_2 \tag{8.4}$$

implies $x_1 \sim x_2$. For then $x_1 R \tau_1 x_1$, $x_2 R \tau_2 x_2$ and hence $x_1 \sim \tau_1 x_1 = \tau_2 x_2 \sim x_2$.
LEMMA 8.1. A function $\varphi$ is invariant if and only if

$$x_1 \sim x_2 \ \text{implies} \ \varphi(x_1) = \varphi(x_2). \tag{8.5}$$

PROOF. Suppose that (8.5) holds and that $x' = \tau x$. Then $x' \sim x$ and hence
$\varphi(x') = \varphi(x)$ so that $\varphi$ is invariant. Conversely, suppose that $\varphi$ is invariant and
consider the equivalence relation according to which two points $x_1$, $x_2$ are
equivalent if and only if $\varphi(x_1) = \varphi(x_2)$. This equivalence relation is closed with
respect to R and hence contains the relation $\sim$ so that $x_1 \sim x_2$ implies
$\varphi(x_1) = \varphi(x_2)$.

THEOREM 8.1. If $\varphi$ is invariant with respect to $\mathcal{S}$ then the distribution of $\varphi(X)$ is
constant over the equivalence classes mod $\bar{\mathcal{S}}$, that is, if $F \sim F'$ mod $\bar{\mathcal{S}}$ then

$$P_F(\varphi(X) \in A) = P_{F'}(\varphi(X) \in A).$$

PROOF. Consider the equivalence in the space of distributions according to
which two distributions F and $F'$ are equivalent if and only if

$$P_F(\varphi(X) \in A) = P_{F'}(\varphi(X) \in A) \quad \text{for all } A \in \mathcal{A}.$$

We need only show that this relation is closed with respect to the relation $FRF'$
if there exists $\bar{\tau}$ such that $F' = \bar{\tau}F$. But let $F' = \bar{\tau}F$. Then

$$P_F(\varphi(X) \in A) = P_F(\varphi(\tau X) \in A) = P_{\bar{\tau}F}(\varphi(X) \in A) = P_{F'}(\varphi(X) \in A),$$

so that $F'RF$ implies $F' \sim F$ as was to be proved.

Consider now a nonparametric problem of the kind studied in the previous
sections. For a suitable class $\mathcal{S}$ of transformations, the invariant tests (that is,
those given by a critical function $\varphi$) are precisely those depending only on the
ranks of the observations. It further turns out that the maximal invariant
partition in the space of distributions coincides with our partition of the
alternatives into nonparametric classes. Thus in particular, Theorems 3.1 and
7.1 are special cases of Theorem 8.1.

We shall now prove the correspondence just indicated in the case of the
two-sample problem. The analogous result for the hypothesis of independence can
be proved in much the same way. The sample space $\mathcal{X}$ is the $(m + n)$-dimensional
Euclidean space of points $(x_1, \dots, x_m, y_1, \dots, y_n)$. The family $\mathcal{F}$ is
the family of all pairs (F, G) of continuous distribution functions. To define the
class of transformations, consider the totality of real-valued (not necessarily
continuous) functions $\tau$ of a real variable which are strictly increasing and such
that $\tau(\pm\infty) = \pm\infty$. The transformations of $\mathcal{S}$ are obtained by applying the
same function $\tau$ to all coordinates $x_i$ and $y_j$. Since each $\tau$ is strictly increasing
the inverse functions $\tau^{-1}$ are uniquely defined and their domains may be
extended to be the entire real line through the condition that $\tau^{-1}$ should be
everywhere nondecreasing. This will insure also that $\tau^{-1}$ is continuous. With these
conventions, if a random variable X has continuous cdf F, the variable $\tau X$ again
has a continuous cdf, which is given by

$$\bar{\tau}F = F(\tau^{-1}). \tag{8.5}$$

It is clear that the maximal invariant with respect to $\mathcal{S}$ is given by the ranks
of $X_1, \dots, X_m, Y_1, \dots, Y_n$, that is, by the equivalence

$$(x_1, \dots, x_m, y_1, \dots, y_n) \sim (x_1', \dots, x_m', y_1', \dots, y_n')$$

if and only if the two sets of numbers are in the same order. This relationship
certainly is invariant, since with respect to a strictly increasing function the
ordinates are in the same order relation as the abscissae. Conversely, given
any two such sets of numbers there exists a strictly increasing function taking
the first into the second.
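Both halves of this argument can be illustrated with a short sketch. The function names are hypothetical, the observations are assumed distinct, and the piecewise-linear map constructed here is just one convenient choice of a strictly increasing function with $\tau(\pm\infty) = \pm\infty$.

```python
def ranks(values):
    """Rank of each entry (1 = smallest); assumes all values are distinct."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        out[idx] = rank
    return out


def increasing_map(src, dst):
    """Strictly increasing piecewise-linear tau with tau(src[i]) = dst[i].

    Requires src and dst to be in the same order; outside the range of src
    the map continues with slope one, so tau(+-inf) = +-inf.
    """
    assert ranks(src) == ranks(dst)
    pts = sorted(zip(src, dst))

    def tau(x):
        if x <= pts[0][0]:
            return pts[0][1] + (x - pts[0][0])
        if x >= pts[-1][0]:
            return pts[-1][1] + (x - pts[-1][0])
        for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
            if x0 <= x <= x1:
                return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

    return tau


if __name__ == "__main__":
    a = [0.3, -1.2, 5.0, 2.2]
    b = [10.0, 7.0, 40.0, 11.0]
    print(ranks(a), ranks([x ** 3 for x in a]))   # ranks unchanged by x -> x**3
    tau = increasing_map(a, b)
    print([tau(x) for x in a])                    # reproduces b
```

The first print shows invariance of the ranks under a strictly increasing transformation; the second shows that two samples in the same order are carried into each other by such a transformation.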

What we shall now prove is that the maximal invariant classes in the space
of distributions coincide with the classes $(f(F), g(F))$ of Section 4. In fact, we
shall prove the more precise statement. The relation

$$F_1(\tau_1^{-1}) = F_2(\tau_2^{-1}), \qquad G_1(\tau_1^{-1}) = G_2(\tau_2^{-1}) \tag{8.6}$$

is equivalent to

$$\begin{aligned}
F_1 &= f(F), & G_1 &= g(F), \\
F_2 &= f(F'), & G_2 &= g(F'),
\end{aligned} \tag{8.7}$$

where all the symbols have the significance given to them earlier. We first state
the obvious

LEMMA 8.2.
(a) Given any two continuous cdf's F, $F'$ there exist continuous, nondecreasing
functions $\tau_1^{-1}$, $\tau_2^{-1}$ such that

$$F(\tau_1^{-1}) = F'(\tau_2^{-1}). \tag{8.8}$$

(b) Given any two continuous cdf's $F_1$, $G_1$ there exist f, g, F such that

$$F_1 = f(F), \qquad G_1 = g(F) \tag{8.9}$$

and such that $F(x) = F(y)$ whenever $F_1(x) = F_1(y)$ and $G_1(x) = G_1(y)$.

Suppose now that (8.7) holds and let $\tau_1$, $\tau_2$ be defined so as to satisfy (8.8).
Then

$$F_1(\tau_1^{-1}) = f(F(\tau_1^{-1})) = f(F'(\tau_2^{-1})) = F_2(\tau_2^{-1})$$

and analogously for the G's.

To prove the converse, assume that (8.6) holds and that f, g, F are defined
so as to satisfy (8.9) as well as the side condition imposed there. If we put

$$F' = F(\tau_1^{-1}(\tau_2)) \tag{8.10}$$

we have that $f(F') = F_1(\tau_1^{-1}(\tau_2)) = F_2(\tau_2^{-1}(\tau_2)) = F_2$ and similarly that
$g(F') = G_2$. Thus our result is established provided we can show that $F'$ as
defined by (8.10) is continuous.

Since F and $\tau_1^{-1}$ are continuous by assumption, discontinuities in $F'$ can
result only from discontinuities in $\tau_2$. Suppose therefore that $\tau_2(x-) = a$,
$\tau_2(x+) = b$. Then $\tau_2^{-1}(a) = \tau_2^{-1}(b)$ and hence because of (8.6) $F_1(\tau_1^{-1}(a)) =
F_1(\tau_1^{-1}(b))$, $G_1(\tau_1^{-1}(a)) = G_1(\tau_1^{-1}(b))$. The assumption following (8.9) then implies
$F(\tau_1^{-1}(a)) = F(\tau_1^{-1}(b))$ and therefore $F'(x-) = F'(x+)$.


NONPARAMETRIC TOLERANCE REGIONS
By D. A. S. FRASER 


University of Toronto 


1. Summary. Nonparametric tolerance regions can be constructed from statis- 
tically equivalent blocks using published graphs by Murphy [8]. In this paper 
the procedure for obtaining the statistically equivalent blocks is generalized. 
The n ‘cuts’ used to form the n + 1 blocks need not cut off one block at a time, 
but at each stage may cut off a group of blocks, the group to be further divided 
at a later stage by a different type of cut in general. An example is given which 
indicates possible applications. 

The results are also interpreted for discontinuous distributions by indicating 
the necessary modifications to the corresponding theorem in [7]. 


2. Introduction. The generality with which nonparametric tolerance regions 
can be formed has been successively treated by Wilks, Wald, Scheffé, Tukey, 
and others in a series of papers [1], [2], [3], [4], [5], [6], [7]. In each case the sample 
space for n observations from a continuous distribution is divided by these ob- 
servations into n + 1 regions or blocks. Subject to mild restrictions on the pro- 
cedure used to divide the sample space, the proportions of the population con- 
tained in these regions have an elementary distribution, a uniform distribution 
over a set prescribed by simple inequalities. Furthermore the marginal distribu- 
tion of the proportion of the population which lies in a group of these regions 
has the Beta distribution. This enables the statistician to choose enough regions 
to make a probability statement such as the following: “In repeated sampling 
the probability is 8 that the region T contains at least a of the population.” 
Graphs for obtaining the probabilities and the number of original regions to 
compound are given in a paper by Murphy [8]. 
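When Murphy's graphs are not at hand, a probability statement of this kind can be computed directly. The sketch below is an illustration under a stated assumption: the coverage of a region made up of r of the $n + 1$ blocks is taken to have the Beta distribution with parameters $(r, n + 1 - r)$, so that the probability of covering at least a proportion a equals $P(B \le r - 1)$ for a binomial variable B with n trials and success probability a. The function name is hypothetical.

```python
from math import comb


def confidence(n, r, a):
    """P(coverage of r of the n + 1 blocks >= a), coverage ~ Beta(r, n + 1 - r).

    Uses the identity P(Beta(r, n - r + 1) >= a) = P(Binomial(n, a) <= r - 1).
    """
    return sum(comb(n, i) * a ** i * (1 - a) ** (n - i) for i in range(r))


if __name__ == "__main__":
    # Sample of 59 observations, region built from 36 of the 60 blocks.
    print(round(confidence(59, 36, 0.5), 3))
```

With $n = 59$, $r = 36$ and $a = .5$, the configuration used in the example of this section, the computed probability is at least the 90% read from the graphs.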

In the previous papers the sample space was partitioned by forming a single 
block at a time; this restriction, however, is not necessary. The whole region 
corresponding to n + 1 blocks may be divided into r blocks and n + 1 — r 
blocks; then each of these sets of blocks may be divided by a procedure de- 
pending on where and how the first division was made. The exact statement 
of the possible procedures is given by Theorem 6.1 in Section 6. 

Advantages of this procedure are perhaps illustrated by the following
example. A sample of 59 observations is made from a continuous bivariate
distribution known to have two modes; a 50% tolerance region in two parts
centering on the two modes is desired. From Murphy's graphs [8] a region formed
from 36 blocks¹ is seen to have a 90% probability of containing at least 50%
of the population, that is, 90% confidence that the region contains at least 50%
of the population. The following procedure is proposed as a solution to obtaining
the required region.

Received 12/6/51.
¹ It is worth noticing that only 60% of the equivalent blocks yields 90% confidence in
at least 50% of the population.

The 59 points are plotted in Fig. 1. The function y is used to remove two blocks
by the cut $c_2$; two further blocks using the function $-y$ are removed by the
cut $c_4$. Similarly x and $-x$ are used to form cuts $c_6$ and $c_8$. The rectangle so
formed now corresponds to 52 blocks.

The rectangle is tentatively cut into eight sections formed by the two diagonals
and the two lines through the center parallel to the x and y axes. For convenience,
number these sections from one to eight clockwise starting at top center. In the
first section cut off one block from the outside using a line making an angle of
$-22\tfrac{1}{2}°$ to the x axis, that is, we use the function $y + x \tan 22\tfrac{1}{2}°$ to form the cut
$c_9$. For the second section use the function $y + x \tan 67\tfrac{1}{2}°$ to remove one block
by the cut $c_{10}$. Apply a similar procedure to each of the 6 remaining sections,
thus forming cuts $c_{11}$, $c_{12}$, $c_{13}$, $c_{14}$, $c_{15}$, and $c_{16}$. The region now remaining
corresponds to 44 blocks.

The eight sections originally were of equal area. Each section has had a block
removed thus reducing the areas to the values $a_1, \dots, a_8$, say. Further cutting
will depend on these areas, they being an indication of the relative positions of
the two modes. Consider the total area of an adjacent pair of reduced sections
and of the opposite pair; for example, total area equals $a_1 + a_2 + a_5 + a_6$. Do
this for each of the four possible selections. From the diagram it is easily seen
that the group with minimum total area corresponds to Sections 3, 4, 7, 8. These
are the sections which presumably tend to separate the two modes; hence we
divide the remaining region by a line with slope $-1$. If the sections had been
2, 3, 6, 7 we would have used a line with slope 0. The rough reasoning behind this
cutthroat procedure should be apparent.

Using the function $y + x$, divide the 44 block region into parts corresponding
to 22 blocks and 22 blocks, that is, we choose the point giving the 22nd largest
value to the function $y + x$ and make the cut $c_{38}$. The two regions formed by
this cut are further reduced with the objective being to form two circular regions
each corresponding to 18 blocks.

Use the function $(y - \eta)^2 + (x - \xi)^2$ to remove four blocks from the
right-hand region. As center of the circle $(\xi, \eta)$ a reasonable choice would be the center,
marked 'x', of the largest circle which can be inscribed in that region. Cuts
$c_{39}$, $c_{40}$, $c_{41}$, and $c_{42}$ are made by this function. We apply a similar procedure to
the left-hand region.

The resultant two circles form a region T composed of 36 blocks and hence
in repeated sampling have 90% confidence of being at least a 50% tolerance 
region. It should be noted that the two parts of T will not always be circular; 
they will be circular with perhaps indentations. (See, for example, cut cy.) 
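
A confidence statement of this kind can be checked numerically. By the coverage theorem of Section 6, the total coverage of m of the n + 1 blocks has a Beta distribution with parameters m and n + 1 - m, and the upper tail of that distribution equals a binomial sum. The following is a minimal sketch; the function name `tolerance_confidence` is hypothetical, and the sample size n = 61 used in the example call is an assumption made for illustration, since the sample size is not stated in this passage.

```python
from math import comb


def tolerance_confidence(n, m, beta):
    """P(coverage of m blocks >= beta) when a sample of n gives n + 1 blocks.

    The coverage of m blocks is Beta(m, n + 1 - m); its upper tail at beta
    equals the probability of at most m - 1 successes in n Bernoulli(beta)
    trials.
    """
    return sum(comb(n, k) * beta**k * (1 - beta) ** (n - k) for k in range(m))


# Illustrative call only: n = 61 is an assumed sample size, not taken from the text.
print(tolerance_confidence(n=61, m=36, beta=0.5))
```

With n = 1 and m = 1 the single block's coverage is uniform on [0, 1], so the function returns 0.5 at beta = 0.5, which is a convenient sanity check.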


3. Notation. Let W symbolize a probability distribution over a general space
$\mathcal{S}$ and let w be an arbitrary point of this space. By the coverage of a set $S \subset \mathcal{S}$
we mean the probability measure of S with respect to the distribution W, that
is, $P_W(S) = P(S)$ is the coverage of the set S. If the set S is random, then the
coverage will be random. 


4. Conditional probabilities when a sample is ordered by a real function. For 
the proofs in Section 5 we shall need to know the form of certain conditional 
distributions. In [7] these conditional distributions were assumed without proof;
here they are more complicated, and a proof of their structure is given in this
section.

Let $\psi(w)$ be a real-valued measurable function over the space $\mathcal{S}$. For a sample
of n elements from this distribution of W over $\mathcal{S}$, we wish to determine the
conditional distribution of the sample given that the jth largest value of $\psi(w)$ is
equal to t. If several of the $w_i$ have $\psi(w_i) = t$, the procedure in ordering these
sample elements is to make each permutation equally likely and select a
permutation at random. That the conditional distribution exists is easily seen from the
fact that in a product space with a product measure the conditional distribution
of $n - 1$ coordinates obtained by conditioning the remaining coordinate is just
the marginal distribution of those $n - 1$ coordinates.

Consider $(w_1, \ldots, w_n)$ as a point in the product space $\prod_{i=1}^{n} \mathcal{S}_i$, where each $\mathcal{S}_i$
is identical to $\mathcal{S}$. The probability measure over this space will be the power
measure of the given measure of W over $\mathcal{S}$. We partition the space $\mathcal{S}$ into disjoint
sets as follows:

$$\mathcal{S} = \overline{S} \cup S^{0} \cup \underline{S},$$

where

$$\overline{S} = \{w \mid \psi(w) > t\}, \qquad S^{0} = \{w \mid \psi(w) = t\}, \qquad \underline{S} = \{w \mid \psi(w) < t\}.$$


The conditional distribution we wish to obtain will be a distribution over a 
subset of the following region in the product space: 


$$X = \bigcup_{j=1}^{n} \Big\{ \prod_{i=1}^{j-1} \mathcal{S}_i \times S^{0}_j \times \prod_{i=j+1}^{n} \mathcal{S}_i \Big\}.$$

The components of this union are not disjoint. Consider the following decom- 
position into disjoint sets: 
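$$X = X_1 \cup X_2 \cup \cdots \cup X_n,$$

where

$$X_1 = \bigcup_{j=1}^{n} X_{1j} = \bigcup_{j=1}^{n} \Big\{ \prod_{i=1}^{j-1} (\mathcal{S}_i - S^{0}_i) \times S^{0}_j \times \prod_{i=j+1}^{n} (\mathcal{S}_i - S^{0}_i) \Big\},$$

$$X_2 = \bigcup_{j=1}^{n} X_{2j} = \bigcup_{j=1}^{n} \Big\{ \bigcup_{k \in I_j} \Big( S^{0}_j \times S^{0}_k \times \prod_{i \in I_j - (k)} (\mathcal{S}_i - S^{0}_i) \Big) \Big\},$$

$$X_{r+1} = \bigcup_{j=1}^{n} X_{r+1,j} = \bigcup_{j=1}^{n} \Big\{ \bigcup_{(k_1 < \cdots < k_r) \in I_j} \Big( S^{0}_j \times \prod_{i=1}^{r} S^{0}_{k_i} \times \prod_{i \in I_j - (k_1, \ldots, k_r)} (\mathcal{S}_i - S^{0}_i) \Big) \Big\},$$

$$X_n = \bigcup_{j=1}^{n} X_{nj} = \bigcup_{j=1}^{n} \Big\{ S^{0}_j \times \prod_{i \in I_j} S^{0}_i \Big\},$$

and $I_j = (1, \ldots, j-1, j+1, \ldots, n)$. The sets $X_1, \ldots, X_n$ are disjoint but
the components of each $X_i$ ($i = 1$ excepted) are not disjoint.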

The advantage of this decomposition is in the symmetry possessed by the
sets $X_{ij}$. $X_{ij}$ is symmetric in the $n - 1$ coordinates other than $w_j$; also $X_{ij}$ is
identical to $X_{ij'}$ if we consider only the $n - 1$ coordinates remaining after
deleting respectively $w_j$ or $w_{j'}$. Therefore consider the following decomposition:

$$X = X^{1} \cup X^{2} \cup \cdots \cup X^{n},$$

where

$$X^{j} = \bigcup_{i=1}^{n} X_{ij}, \qquad j = 1, \ldots, n.$$

The sets $X^{1}, X^{2}, \ldots, X^{n}$ are symmetric in the $n - 1$ coordinates obtained
by omitting $w_1, w_2, \ldots, w_n$, respectively; also as far as these $n - 1$ coordinates
are concerned the sets are identical. However, the sets are not disjoint, but
overlap in a simple manner indicated by the assignment of probability measure
below.

Since X is composed of n identical components as far as the "$n - 1$ coordinates"
are concerned, we can treat any one of them, say $X^{1}$. Assign a probability measure
to $X^{1}$ as follows: Let $P^{1}$ be the measure such that the measure over $X_{11}$ is the
original power measure over $\mathcal{S}^{n}$, the measure over $X_{21}$ is the original power
measure over $\mathcal{S}^{n}$ reduced by the factor $\tfrac{1}{2}$, the measure over $X_{r1}$ is the original power
measure over $\mathcal{S}^{n}$ reduced by the factor $1/r$, and the measure over $X_{n1}$ is the
original power measure over $\mathcal{S}^{n}$ reduced by the factor $1/n$. When the n identical
cases $X^{1}, \ldots, X^{n}$ are compounded the original measure over X is reproduced;
the sets overlap in such a manner that for points with measure reduced by the
factor $1/r$, r sets overlap, reproducing the original measure. If the probability
measure of the tie set $S^{0} = \{w \mid \psi(w) = t\}$ is equal to zero, we need only consider
the first set $X_1$. Its measure is zero, but if we consider the $n - 1$ coordinates that
remain when the coordinate lying in $S^{0}$ is deleted, the marginal measure
gives us the conditional measure of those $n - 1$ elements of the sample. This
particular case is covered again after the following simplification of the general
case.

Since we are interested only in the $n - 1$ coordinates and since the distribution
is homogeneous with respect to the nth coordinate, we now work in the
product space of those $n - 1$ coordinates. The measures $P^{j}$ will be altered by a
factor if $P(S^{0}) \neq 0$; otherwise we have the particular case mentioned above and
the conditional distribution is identical to the marginal. Therefore, letting
$I = (2, \ldots, n)$, we have

$$X_1 = \prod_{i \in I} (\mathcal{S}_i - S^{0}_i),$$

$$X_2 = \bigcup_{k \in I} \Big( S^{0}_k \times \prod_{i \in I - (k)} (\mathcal{S}_i - S^{0}_i) \Big),$$

$$\cdots$$

$$X_n = \prod_{i \in I} S^{0}_i,$$

and $P^{*}$ is the measure such that the measure over $X_1$ is the original measure
over $\mathcal{S}^{n-1}$, the measure over $X_2$ is the original measure over $\mathcal{S}^{n-1}$ reduced by the
factor $\tfrac{1}{2}$, and the measure over $X_n$ is the original measure over $\mathcal{S}^{n-1}$ reduced by the
factor $1/n$.

The conditional distribution for which we are looking is obtained from the
distribution over the space $Y \subset \bigcup_{j=1}^{n} X_j$, where

$$Y = \bigcup_{P} \Big\{ \prod_{i=2}^{j} (\overline{S}_i \cup S^{0}_i) \times \prod_{i=j+1}^{n} (\underline{S}_i \cup S^{0}_i) \Big\},$$

and the union is over all combinations P of $j - 1$ integers chosen from the $n - 1$.


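These $j - 1$ integers index the coordinates for which the projection is $\overline{S} \cup S^{0}$.
But since we are interested only in the distribution of the $j - 1$ coordinates
"above" and $n - j$ coordinates "below" with respect to the ranking introduced
by $\psi(w)$, we introduce the following decomposition of Y:

$$Y = \bigcup_{r=0}^{j-1} \bigcup_{s=0}^{n-j} Y_{rs},$$

where

$$Y_{00} = \bigcup_{P} \Big\{ \prod_{i=2}^{j} \overline{S}_i \times \prod_{i=j+1}^{n} \underline{S}_i \Big\} = \bigcup_{P} Y'_{00},$$

$$Y_{rs} = \bigcup_{P} \Big\{ \bigcup_{P_r^{j-1}} \Big[ \prod_{i=2}^{r+1} S^{0}_i \times \prod_{i=r+2}^{j} \overline{S}_i \Big] \times \bigcup_{P_s^{n-j}} \Big[ \prod_{i=j+1}^{j+s} S^{0}_i \times \prod_{i=j+s+1}^{n} \underline{S}_i \Big] \Big\} = \bigcup_{P} Y'_{rs},$$

$$Y_{j-1,n-j} = \bigcup_{P} \Big\{ \prod_{i=2}^{n} S^{0}_i \Big\} = \bigcup_{P} Y'_{j-1,n-j},$$

and $P_r^{j-1}$ indicates that the union is taken over all combinations of r integers
chosen from $j - 1$ integers (the set corresponding to a particular combination is
placed after $\bigcup$).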


Using this decomposition, the distribution $P^{*}$ can be broken into $\binom{n-1}{j-1}$
identical cases as far as the $j - 1$ coordinates "above" and $n - j$ coordinates
"below" are concerned. A typical one of these cases is:

$$Y' = \bigcup_{r=0}^{j-1} \bigcup_{s=0}^{n-j} Y'_{rs}$$

with measure $P^{**}$ defined so that the measure over $Y'_{00}$ is the original measure
over $\mathcal{S}^{n-1}$ (the $P^{*}$ measure), the measure over $Y'_{rs}$ is the original measure over
$\mathcal{S}^{n-1}$ reduced by the factor $1 / \big[(r+s+1)\binom{r+s}{r}\big]$ (the $P^{*}$ measure reduced
by the factor $1 / \binom{r+s}{r}$), and the measure over $Y'_{j-1,n-j}$ is the original measure
over $\mathcal{S}^{n-1}$ reduced by the factor $1 / \big[n\binom{n-1}{j-1}\big]$ (the $P^{*}$ measure reduced by
the factor $1 / \binom{n-1}{j-1}$). This measure when compounded with the other similar
permutations is easily seen to reproduce the measure of $P^{*}$.

Thus the conditional distribution of the sample given that the jth largest
value of $\psi(w) = t$ (ties broken by equally likely random choice) is that of $j - 1$
coordinates $\bar{w}_2, \ldots, \bar{w}_j$, $n - j$ coordinates $\underline{w}_{j+1}, \ldots, \underline{w}_n$, and one
coordinate $w^{*}$ with distribution as follows: $(w^{*}, \bar{w}_2, \ldots, \bar{w}_j, \underline{w}_{j+1}, \ldots, \underline{w}_n)$ takes
its values in $S^{0} \times Y'$ with probability measure obtained by normalizing the
product measure of the original W measure truncated to $S^{0}$ and the measure
$P^{**}$ over $Y'$. The distribution is symmetric with respect to $\bar{w}_2, \ldots, \bar{w}_j$ and with
respect to $\underline{w}_{j+1}, \ldots, \underline{w}_n$.

This conditional distribution is not that of samples of $j - 1$ and $n - j$ from
truncated portions of the original distribution except in the particular case in
which the tie set $S^{0}$ has measure 0. However, we can express it in terms of samples of $j - 1$
and $n - j$ (with the accompanying simplicity of structure) by the procedure
described below.

We replace the original distribution W by the combination (W, U) where U
is a random variable uniformly distributed on [0, 1] and independent of W. For
any value of t for which $P_W(\psi(W) = t) \neq 0$, record along with the function $\psi(w)$
the observed value of U. For a sample of n from the distribution (W, U), we
order the elements according to $\psi(w)$ and if there are ties we order them according
to the corresponding values of u.
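
The randomized ordering just described can be sketched in a few lines. This is an illustration under stated assumptions rather than a procedure taken from the paper: the function name `order_with_random_ties` and the representation of sample points as arbitrary Python objects are hypothetical, and the sketch attaches a uniform variate to every element, which has the same effect as recording U only at the tied values.

```python
import random


def order_with_random_ties(sample, psi, rng=random):
    """Order a sample from largest to smallest psi, breaking ties by a uniform U.

    Each element w is paired with an independent uniform variate u, and the
    pairs (psi(w), u) are sorted lexicographically in decreasing order.
    """
    tagged = [(psi(w), rng.random(), w) for w in sample]
    tagged.sort(key=lambda item: (item[0], item[1]), reverse=True)
    return [w for _, _, w in tagged]


# Example: two of the three points are tied under psi.
points = [("a", 1.0), ("b", 2.0), ("c", 1.0)]
ordered = order_with_random_ties(points, psi=lambda w: w[1], rng=random.Random(0))
```

Because the auxiliary variates are continuous, ties in the pair (psi, u) occur with probability zero, which is what lets the argument of this section treat the tied case like the continuous one.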

We now find the conditional distribution of the sample given that the jth
largest value of $(\psi(W), U)$ is $(t, u)$. The theory at the beginning of this section
immediately extends itself even though the combination $(\psi(w), u)$ is not
real-valued. Since $P(\psi(W) = t, U = u) = 0$, the distribution of the $n - 1$ elements
is that of (1) a sample of $j - 1$ from the distribution having relative measure
$P_1$ which over $\overline{S}$ is the measure of W, and over $S^{0}$ is the measure of W reduced by
the factor $(1 - u)$, and (2) a sample of $n - j$ from the distribution having relative
measure $P_2$ which over $\underline{S}$ is the measure of W, and over $S^{0}$ is the measure of
W reduced by the factor u.

We now show that the previous conditional distribution is obtained from
this distribution of samples of $j - 1$ and $n - j$. The conditional distribution
given that the jth value has $\psi(w) = t$ is obtained by taking the marginal
distribution over U; it is easily seen to be a distribution over the space $Y'$ defined
above. We now evaluate the measure for this distribution.

The marginal distribution of the u coordinate for the jth largest value is
needed. For continuous distributions the jth largest value has a mapping of the
Beta distribution with parameters $n - j + 1$ and j. However, the mapping is
linear for the portion in which we are interested and therefore has as density
function

$$K \cdot [P(\overline{S}) + (1 - u)P(S^{0})]^{j-1} [P(\underline{S}) + uP(S^{0})]^{n-j}.$$


The relative probability measure over $Y'$ is obtained by considering the different
subsets $Y'_{rs}$. The measure over $Y'_{rs}$ given the u value for the jth largest element
is the original product measure over $\mathcal{S}^{n-1}$ reduced by the factor

$$\frac{(1 - u)^{r} u^{s}}{[P(\overline{S}) + (1 - u)P(S^{0})]^{j-1} [P(\underline{S}) + uP(S^{0})]^{n-j}}.$$

The factor in the denominator obtains from the normalization of the distributions
of samples of $j - 1$ and $n - j$ from $P_1$ and $P_2$. Taking the marginal
distribution with respect to u, the factor becomes

$$\int_0^1 \frac{(1 - u)^{r} u^{s} \, K [P(\overline{S}) + (1 - u)P(S^{0})]^{j-1} [P(\underline{S}) + uP(S^{0})]^{n-j}}{[P(\overline{S}) + (1 - u)P(S^{0})]^{j-1} [P(\underline{S}) + uP(S^{0})]^{n-j}} \, du = K \frac{\Gamma(r + 1)\Gamma(s + 1)}{\Gamma(r + s + 2)} = \frac{K}{(r + s + 1)\binom{r+s}{r}}.$$

The constant K is present to normalize the distribution over $Y'$. Thus the model
in terms of samples of $j - 1$ and $n - j$ reproduces the conditional distribution,
since the marginal distribution over U of the samples of $j - 1$ and $n - j$
reproduces the distribution $P^{**}$.
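
The last step rests on the Beta integral, which equals $1 / [(r + s + 1)\binom{r+s}{r}]$ apart from the constant K. A short exact check in rational arithmetic is sketched below; the helper names `beta_integral` and `closed_form` are hypothetical and serve only this illustration.

```python
from fractions import Fraction
from math import comb


def beta_integral(r, s):
    """Exact integral of (1 - u)**r * u**s over [0, 1].

    (1 - u)**r is expanded by the binomial theorem and integrated term by term.
    """
    return sum(Fraction((-1) ** k * comb(r, k), s + k + 1) for k in range(r + 1))


def closed_form(r, s):
    """The reduction factor 1 / ((r + s + 1) * C(r + s, r)), without K."""
    return Fraction(1, (r + s + 1) * comb(r + s, r))


# The two expressions agree for every pair of small non-negative integers.
agree = all(beta_integral(r, s) == closed_form(r, s)
            for r in range(8) for s in range(8))
```

Taking r = j - 1 and s = n - j gives the factor $1 / [n\binom{n-1}{j-1}]$ for the set on which every remaining coordinate is tied.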


5. Definition of the Blocks for the continuous case. In order to form blocks 
with the generality described in the introduction, we not only need a sequence 
of functions which describe how the cuts are made but also a sequence of in- 
tegers to indicate through which point each cut is to be made. The procedure 
no longer produces n + 1 blocks one at a time, but rather n divisions or cuts are 
made which successively divide the space into sets, each set being equivalent 
to the union of a number of blocks. 

Consider n points $w_1, w_2, \ldots, w_n$ in the space $\mathcal{S}$. To define the $n + 1$ blocks
the following two sequences are needed; the first,

$$\varphi_1(w),\ \varphi_2(w \mid \varphi_1),\ \ldots,\ \varphi_n(w \mid \varphi_1, \ldots, \varphi_{n-1}),$$

is a sequence of real-valued functions over $\mathcal{S}$; the second,

$$p_1,\ p_2(\varphi_1),\ p_3(\varphi_1, \varphi_2),\ \ldots,\ p_n(\varphi_1, \ldots, \varphi_{n-1}),$$

is a permutation of the integers $(1, 2, \ldots, n)$. In each case the elements of the
sequences depend as indicated on values of previous elements in the first sequence.
The value of $\varphi$ to be inserted in the functions at any stage is described in
Definition 5.1 below.

For a set of real numbers $(x_1, \ldots, x_n)$ we define $\max_i^{r} x_i$ to be the rth largest
value in the set, and $i(\max^{r} x_i)$ to be the integer i for which $x_i = \max_i^{r} x_i$. If
$i(\max^{r} x_i)$ is not uniquely determined, the context in which the symbol is used
will indicate which of the available integers is to be chosen.

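DEFINITION 5.1. The set $(w_1, w_2, \ldots, w_n)$ of points in $\mathcal{S}$ and the functions
$\{\varphi_i(w \mid \varphi_1, \ldots, \varphi_{i-1})\}$ and $\{p_i(\varphi_1, \ldots, \varphi_{i-1})\}$ define a partition of $\mathcal{S}$ into disjoint
blocks $S_1, \ldots, S_{n+1}$ and cuts $T_1, \ldots, T_n$ as follows:

(i) For the first stage,

$$T_{p_1} = \{w \mid \varphi_1(w) = \varphi_1^{p_1}\},$$

$$\bigcup_{p \le p_1} S_p \cup \bigcup_{p < p_1} T_p = \{w \mid \varphi_1(w) > \varphi_1^{p_1}\},$$

$$\bigcup_{p > p_1} S_p \cup \bigcup_{p > p_1} T_p = \{w \mid \varphi_1(w) < \varphi_1^{p_1}\},$$

where $\varphi_1^{p_1} = \max_i^{p_1} \varphi_1(w_i)$, and $i(\max^{p_1} \varphi_1(w_i)) = i(\varphi_1^{p_1})$ is any one of the
integers available. $\beta_1(S; p)$, $\beta_1(T; p)$ are used to designate for the S's, T's respectively
the index sets which contain p and over which the set union operator is applied.
Corresponding to each $\beta_1(T; p)$ there is a set of integers $\gamma_1(p)$ which index the points $w_i$
to be used for the cuts indicated by $\beta_1(T; p)$; if $p > p_1$, $< p_1$ then $\gamma_1(p)$ consists of the
$n - p_1$, $p_1 - 1$ integers j for which $\varphi_1(w_j) \le \varphi_1^{p_1}$, $\ge \varphi_1^{p_1}$.

(ii) For the second stage, writing $p_2$ for $p_2(\varphi_1^{p_1})$ in the subscripts,

$$T_{p_2} = \{w \mid \varphi_2(w \mid \varphi_1^{p_1}) = \varphi_2^{p_2},\ \Xi(w)\},$$

$$\bigcup_{\substack{p \le p_2 \\ p \in \beta_1(S;\, p_2)}} S_p \cup \bigcup_{\substack{p < p_2 \\ p \in \beta_1(T;\, p_2)}} T_p = \{w \mid \varphi_2(w \mid \varphi_1^{p_1}) > \varphi_2^{p_2},\ \Xi(w)\},$$

$$\bigcup_{\substack{p > p_2 \\ p \in \beta_1(S;\, p_2)}} S_p \cup \bigcup_{\substack{p > p_2 \\ p \in \beta_1(T;\, p_2)}} T_p = \{w \mid \varphi_2(w \mid \varphi_1^{p_1}) < \varphi_2^{p_2},\ \Xi(w)\},$$

where $\varphi_2^{p_2} = \max_{i \in \gamma_1(p_2)}^{p_2'} \varphi_2(w_i \mid \varphi_1^{p_1})$, $p_2' = p_2(\varphi_1^{p_1}) - \min \beta_1(T; p_2(\varphi_1^{p_1})) + 1$,
and $i(\varphi_2^{p_2})$ is any one of the integers available. $\beta_2(S; p)$, $\beta_2(T; p)$ are used to designate
for the S's, T's respectively the smallest index sets, containing p, over which the set
union operator is applied. $\Xi(w)$ in each case stands for the condition that the points
of the sets being defined at that stage conform to restrictions imposed on the points
of those same sets at the earlier stages. Thus in effect $\Xi(w)$ stands for all the inequalities
at the previous stages which apply to points of the sets under consideration.

(iii) For the rth stage, writing $p_r$ for $p_r(\varphi_1^{p_1}, \ldots, \varphi_{r-1}^{p_{r-1}})$ in the subscripts,

$$T_{p_r} = \{w \mid \varphi_r(w \mid \varphi_1^{p_1}, \ldots, \varphi_{r-1}^{p_{r-1}}) = \varphi_r^{p_r};\ \Xi(w)\},$$

$$\bigcup_{\substack{p \le p_r \\ p \in \beta_{r-1}(S;\, p_r)}} S_p \cup \bigcup_{\substack{p < p_r \\ p \in \beta_{r-1}(T;\, p_r)}} T_p = \{w \mid \varphi_r(w \mid \varphi_1^{p_1}, \ldots, \varphi_{r-1}^{p_{r-1}}) > \varphi_r^{p_r};\ \Xi(w)\},$$

$$\bigcup_{\substack{p > p_r \\ p \in \beta_{r-1}(S;\, p_r)}} S_p \cup \bigcup_{\substack{p > p_r \\ p \in \beta_{r-1}(T;\, p_r)}} T_p = \{w \mid \varphi_r(w \mid \varphi_1^{p_1}, \ldots, \varphi_{r-1}^{p_{r-1}}) < \varphi_r^{p_r};\ \Xi(w)\},$$

where $\varphi_r^{p_r} = \max_{i \in \gamma_{r-1}(p_r)}^{p_r'} \varphi_r(w_i \mid \varphi_1^{p_1}, \ldots, \varphi_{r-1}^{p_{r-1}})$ and
$p_r' = p_r(\varphi_1^{p_1}, \ldots, \varphi_{r-1}^{p_{r-1}}) - \min \beta_{r-1}(T; p_r(\varphi_1^{p_1}, \ldots, \varphi_{r-1}^{p_{r-1}})) + 1$.
$\beta_r(S; p)$, $\beta_r(T; p)$ are the smallest index sets
for the S's, T's respectively which contain the integer p; the index sets considered are
those over which the indices of the unions in stages 1 to r range. Corresponding to each
$\beta_r(T; p)$ is a set of integers $\gamma_r(p)$ which indexes the points $w_i$ to be used for the cuts
indicated by $\beta_r(T; p)$; $\gamma_r(p)$ is identical to $\gamma_{r-1}(p)$ unless $\beta_{r-1}(T; p)$ contains
$p_r(\varphi_1^{p_1}, \ldots, \varphi_{r-1}^{p_{r-1}})$, in which case $\beta_{r-1}(T; p)$ is partitioned into $(p_r(\varphi_1^{p_1}, \ldots, \varphi_{r-1}^{p_{r-1}}))$
and two $\beta_r(T; p)$'s, and correspondingly $\gamma_{r-1}(p)$ is partitioned into $i(\varphi_r^{p_r})$ and two
$\gamma_r(p)$'s according as $\varphi_r(w_i) = \varphi_r^{p_r}$, $\ge \varphi_r^{p_r}$, $\le \varphi_r^{p_r}$. $\Xi(w)$ is as defined in stage (ii).
The procedure for the rth stage is applied for $r = 2, \ldots, n$.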
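
The bookkeeping in this definition is easier to follow in code. The sketch below is a simplified reading, not the paper's own algorithm: it assumes the continuous case (no ties), and it takes the functions and the permutation as fixed lists, ignoring their permitted dependence on the values at earlier cuts. The name `build_cuts` and the tuple representation of index ranges are hypothetical. At each stage the group of sample points whose pending cut indices include $p_r$ is ranked by the current function, and the point of local rank $p_r - \min \beta_{r-1}(T; p_r) + 1$ defines the cut $T_{p_r}$.

```python
def build_cuts(points, funcs, perm):
    """Return {cut index p: index of the sample point that defines cut T_p}.

    points : list of n sample points
    funcs  : list of n real-valued functions, one per stage (assumed fixed)
    perm   : permutation of 1..n giving the cut index chosen at each stage
    """
    n = len(points)
    # Each group (lo, hi, members) still has to supply the cuts T_lo, ..., T_hi;
    # it always holds exactly hi - lo + 1 sample points.
    groups = [(1, n, list(range(n)))]
    cuts = {}
    for phi, p in zip(funcs, perm):
        # Find the group of pending cuts that contains index p.
        k = next(i for i, (lo, hi, _) in enumerate(groups) if lo <= p <= hi)
        lo, hi, members = groups.pop(k)
        # Rank the group's points from largest to smallest value of phi.
        ranked = sorted(members, key=lambda i: phi(points[i]), reverse=True)
        local = p - lo  # zero-based form of p - min(beta) + 1
        cuts[p] = ranked[local]
        above, below = ranked[:local], ranked[local + 1:]
        if above:
            groups.append((lo, p - 1, above))
        if below:
            groups.append((p + 1, hi, below))
    return cuts


# Example: three points in the plane, every stage cutting on the x coordinate.
pts = [(0.1, 0.9), (0.5, 0.2), (0.8, 0.6)]
by_x = lambda w: w[0]
result = build_cuts(pts, [by_x, by_x, by_x], perm=[2, 1, 3])
```

When every stage uses the same function, as in the example, cut $T_p$ is simply defined by the point with the pth largest value, whatever the permutation; the generality of the definition comes from letting the function change between stages.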


6. General results for the continuous case.


THEOREM 6.1. If $\varphi_i(w \mid \varphi_1^{p_1}, \ldots, \varphi_{i-1}^{p_{i-1}})$ has a continuous conditional distribution
for all values of $\varphi_1^{p_1}, \ldots, \varphi_{i-1}^{p_{i-1}}$ except for a set of probability measure (W) zero
and for all i, and if for a sample of n $(w_1, \ldots, w_n)$ from the distribution W, blocks
$S_1, \ldots, S_{n+1}$ are defined according to Definition 5.1, then

(i) The blocks are random,

(ii) The coverages $(c_1, \ldots, c_{n+1})$, where $c_i = P_W(S_i)$, are random and have a
uniform distribution over the region in $R^{n+1}$ defined by the linear conditions

$$\sum_{i=1}^{n+1} c_i = 1, \qquad c_i \ge 0 \quad (i = 1, \ldots, n + 1).$$

An example of the uniform distribution referred to in the theorem is the
distribution of $U_1, U_2 - U_1, \ldots, U_n - U_{n-1}, 1 - U_n$, where $U_1, \ldots, U_n$
are the order statistics of a sample of n from the uniform distribution (0, 1).
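
This example is easy to reproduce. The following minimal sketch draws n uniform variates, sorts them, and returns the n + 1 spacings; the function name `uniform_spacings` is hypothetical.

```python
import random


def uniform_spacings(n, rng=random):
    """Spacings U_1, U_2 - U_1, ..., 1 - U_n of n ordered uniform variates."""
    u = sorted(rng.random() for _ in range(n))
    edges = [0.0] + u + [1.0]
    return [b - a for a, b in zip(edges, edges[1:])]


coverages = uniform_spacings(5, random.Random(1))
```

The returned values are non-negative and sum to 1, so each draw is a point of the region described in the theorem.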

The proof outlined below does not presuppose the results obtained in the earlier 
papers [1] to [7] in the References, but rather it follows directly from the results 
of Section 5 and a simple probability mapping. 

PROOF. For a sample of n from the uniform distribution [0, 1] we define coverages
$c_1', \ldots, c_{n+1}'$ as follows:

$$c_1' = U_1,$$
$$c_2' = U_2 - U_1,$$
$$\cdots$$
$$c_{n+1}' = 1 - U_n,$$

where $U_1, \ldots, U_n$ are the order statistics of the sample. It is well known that
these coverages have the distribution described in the theorem.

We consider the density of this distribution as a product of conditional and
marginal densities. It can be written as the marginal distribution of $\sum_{i=1}^{p_1} c_i' = C_{p_1}$,
which is a Beta distribution, multiplied by the conditional distribution of
$c_1', \ldots, c_{p_1}'$, and the conditional distribution of $c_{p_1+1}', \ldots, c_{n+1}'$. These two
conditional distributions are independent. The first is the distribution of coverages
(each reduced by the factor $C_{p_1}$) obtained from a sample of $p_1 - 1$ from the
uniform distribution [0, 1]. The second is that of coverages (reduced by $1 - C_{p_1}$)
obtained from a sample of $n - p_1$ from the uniform distribution [0, 1]. Similarly
each of these conditional coverage distributions can be written as a product of a
marginal and two conditionals, and so on. These results are immediate consequences
of the geometry of the region in $R^{n+1}$.

Consider now a sample of n from the uniform distribution over $[0, 1]^{n}$, the
unit cube in $R^{n}$. Order the sample with respect to the first coordinate $u_1$ from
the smallest to largest and pick the $p_1$th point. The marginal distribution of
the first coordinate of this point is that of $C_{p_1}$ in the previous situation. It is easily
seen that the conditional distribution of the remainder of the sample is that of a
sample of $p_1 - 1$ from the uniform distribution over $[0, 1]^{n}$, the first coordinate
(and any derived coverages) being reduced by the factor $C_{p_1}$, and of a sample of
$n - p_1$ from the uniform distribution over $[0, 1]^{n}$, the first coordinate being
reduced by the factor $1 - C_{p_1}$. Similarly each of these conditional distributions
can be broken into a marginal and two conditionals and so on. The important
thing is that using the conditional approach the coverage distributions are
easily seen to be the same in the two somewhat different situations.

The coverages referred to in the theorem can be obtained by a mapping from 
the coverages in the latter case above, and since those coverages have been 
shown to have the distribution described in the theorem, the proof follows. The 
mapping referred to is the following. 

$\varphi_1(W)$ has a continuous distribution. Consider a monotone nonincreasing function
$g_1(u_1)$ such that $g_1(U)$ has the same distribution as $\varphi_1(W)$ when U has the
uniform distribution [0, 1]. Apply this mapping to the second situation above;
the region having $u_1 < C_{p_1}$ maps into the set of points of $\varphi_1(w)$ which corresponds
to $\bigcup_{i=1}^{p_1} S_i$ (the $T_i$ have probability measure zero). Thus the marginal distribution
of $C_{p_1}$ is identical to that of $\sum_{i=1}^{p_1} c_i$. The conditional distribution of the remainder
of the sample given that $\sum_{i=1}^{p_1} c_i$ is fixed ($\max^{p_1} \varphi_1(w_i)$ fixed) is by Section 4 that of
samples of $p_1 - 1$, $n - p_1$ from the original distribution restricted to points for
which $\varphi_1(w) > \max^{p_1} \varphi_1(w_i)$, $\varphi_1(w) < \max^{p_1} \varphi_1(w_i)$. Since we are left with the
probability distribution of samples of $p_1 - 1$, $n - p_1$ for which any derived
coverages are reduced by the factor $C_{p_1}$, $1 - C_{p_1}$, we can split up these conditional
distributions just as we did the original distribution.

At each stage in the formation of the blocks $S_i$, we define a mapping from a
coordinate (taken in the order $u_1, \ldots, u_n$) of $[0, 1]^{n}$ to the range of the $\varphi$ being
considered. The mapping is chosen to reproduce the required conditional
distribution of that $\varphi$. Thus the splitting up by successive cuts of the space $\mathcal{S}$
corresponds to splitting by cuts in $[0, 1]^{n}$. The successive cuts are made parallel to
the coordinate planes $u_1 = 0, u_2 = 0, \ldots, u_n = 0$. Thus the distribution of the
$c'$'s is reproduced in the c's by using conditional distributions successively,
corresponding to the steps in which the cuts were made. This completes the proof.


7. Main results for the discontinuous case. The results for the discontinuous 
case correspond very closely to those given in [7]. However, for the proof of the 
main theorem the results of Section 4 in the present paper are needed and can 
no longer be assumed to be sufficiently obvious. The mappings from the uniform
(continuous) distribution to the discontinuous distributions supply the
necessary randomization at cuts having finite probability. This permits the conditional
distributions to be used in the form of samples from a truncated distribution.
The proof as a whole follows the pattern set in Section 8 of [7] with
the sort of modification indicated by the continuous case proof in Section 6 of
this paper.


The definition of the m-system of functions carries over. The dependence of
any $\Phi$ on other $\Phi$'s with smaller subscripts is through the values of those $\Phi$'s at
their respective cuts.


The definition of the blocks, cuts and coverages is as given in Section 5 of this 
paper with the following modifications. 

(i) The $\varphi$'s are replaced by $\Phi$'s.

(ii) When more than one point falls on a cut the point should be chosen at 
random, perhaps most conveniently in such a manner that each point has the 
same probability of being selected. 


(iii) The closed blocks $\overline{S}_1, \ldots, \overline{S}_{n+1}$ are defined by the expressions for
$S_1, \ldots, S_{n+1}$ where throughout $<$ is replaced by $\le$ and $>$ by $\ge$.

(iv) Definitions (7.4) and (7.5) in [7] are used to define block groups and 
coverages. 

The main theorem for the discontinuous case reads as Theorem 8.1 in [7]
with the obvious modifications as indicated by the theorem for the continuous
case in the present paper.


SEQUENTIAL SAMPLING TAGGING FOR POPULATION
SIZE PROBLEMS¹


By Leo A. Goodman


The University of Chicago 


Summary. Let P be a finite population which may have some subsets of known 
size but whose total size N is unknown. We shall consider the problem of point 
and interval estimation, tests of hypotheses, and fiducial distributions of N for 
some sampling-tagging procedures. The problem of estimating the number of 
classes in a population [1], [2], when it is known that the same number of elements 


is contained in each class, may be considered within the general problem 
discussed. 


1. Introduction. We shall be interested in the following sequential sampling-tagging
procedure $S(L, n_i)$. Let $\{n_i\}$ be a sequence of positive integers and let
$S(L, n_i)$ denote the procedure whereby:

(1) $n_1$ elements are drawn at random from P, the number of elements which
are drawn from subsets of P of known size is observed, the sampled elements
are tagged so that they may be distinguished from the remaining elements, and
replaced in P;

(2) $n_2$ elements are drawn from P, the number of tagged elements and elements
drawn from subsets of known size is observed, the sampled elements are
tagged and replaced in P;

(3) ...;
this procedure is discontinued when a total of at least $L > 0$ tagged elements or
elements from the known subsets have been drawn.
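
The procedure can be simulated directly. The sketch below is an illustration under assumptions that the text does not spell out: the population has no subsets of known size, each sample is drawn without replacement within the sample, and the sample sizes are supplied as an iterable. The names `run_procedure`, `sizes`, and `rng` are hypothetical.

```python
import random
from itertools import repeat


def run_procedure(N, L, sizes, rng=random):
    """Simulate S(L, n_i) on a population of N elements, none of known class.

    Returns (t, tagged_seen): the total number of elements drawn and the
    number of previously tagged elements among them. If `sizes` runs out
    before L tagged elements are seen, the counts reached so far are returned.
    """
    tagged = set()
    t = 0
    tagged_seen = 0
    for n_i in sizes:
        sample = rng.sample(range(N), n_i)  # one sample of n_i distinct elements
        tagged_seen += sum(1 for e in sample if e in tagged)
        tagged.update(sample)               # tag the sample and replace it in P
        t += n_i
        if tagged_seen >= L:                # stopping rule of S(L, n_i)
            break
    return t, tagged_seen


# Example: single draws (n_i = 1) from N = 365, stopping at the first repeat.
t, seen = run_procedure(365, 1, repeat(1), random.Random(0))
```

With $n_i = 1$ and L = 1 this is the calendar example discussed later in the introduction: drawing stops the first time a previously observed day reappears.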

The following practical cases are instances of this general procedure. It is 
well known that the decennial census is not complete. One would, therefore, 
like to estimate the total number actually living in the United States; that is, 
consider the problem of finding out how many people were not enumerated in 
the census. We would draw a sample of people and investigate how many people 
in this sample had been enumerated. We would then list the nonenumerated 
people in this sample, draw another sample and investigate how many people 
in this second sample had been enumerated in the census or had been listed 
in the first sample, ...; this procedure is discontinued when a total of at least 
L people have been found who had been enumerated in the census or listed in 
one of the preceding samples. 

Suppose one wished to estimate the total population size two years after the
census had been taken. One might divide the population into those people who
were enumerated in the census and are still alive, those born since the census
was taken, and then the remaining people. Death records might be used in conjunction
with census records to determine how many people who were enumerated
are now dead. Using these results in conjunction with birth records we
might proceed as before: sampling, investigating, listing, resampling, ....


Received 2/13/51, revised form 8/15/52.

¹ Most of the results described in this paper were presented to the Summer Seminar in
Statistics at the University of Connecticut, August 22, 1950, and to meetings of the Institute
of Mathematical Statistics at Chicago, Illinois, December 27, 1950.

In some areas the number of people in some classes of population (e.g., certain 
professional groups) is known. We could then use this information and a sam- 
pling, investigating, listing, resampling, ... procedure to estimate the number 
of people in that area. One might, of course, wish to estimate the total population 
size of an area by these methods without resorting to any previous information. 

The question concerning the number of animals of a given kind in a specified 
region is often of importance in ecological work. One might set traps in the area 
to catch these animals. When the traps are filled the animals are tagged and 
released, and the traps are moved and reset. The number of tagged animals 
appearing in the traps is then observed, the newly trapped animals are tagged, 
released, and we again move and reset the traps, .... 

Marking methods are also used to estimate insect and fish populations. This 
paper studies procedures and estimates which may be of use in such problems. 

Another case which is of some archeological interest deals with the problem 
of determining how many days there were in the calendar of some ancient 
civilization. By observing the days marked on gravestones and dealing with 
these days as though they were samples drawn at random from the total popu- 
lation of N days in the annual calendar, we might estimate N by observing more 
graves and taking note of days which occur more than once (marked elements 
reappearing).

In this study, we do not allow for population changes occurring during sam- 
pling nor for nonrandom sampling. 

If $S(L, n_i)$ or a nonsequential fixed number of random samples procedure is
applied, it is found that nondifferentiated tags (similar tags for each sample)
are sufficient for the estimation problem and that the general problem may be
reduced to the nondifferentiated case where P contains no subsets of known size.
Given $S(L, n_i)$, there exists a minimum variance unbiased estimator (m.v.u.e.)
of N which may be determined as the quotient of two determinants and simplified,
by combinatorial methods, in special cases. Tables which shorten computation
appear in [3].

If $\{n_i\}$ is bounded, as N approaches infinity, the limiting distribution of $t^2/N$,
where t is the total number of elements drawn before the procedure ceases, is
$\chi^2$ with $2L$ degrees of freedom. Calculations indicate that the exact distribution
differs only slightly from $\chi^2$ when $N = 365$, $n_i = 1$, $L = 1$. Using the $\chi^2$ approximation
to the exact distribution, we find that the asymptotic m.v.u.e. of N is
$t^2/2L$, and that approximate one- or two-sided confidence intervals may be
obtained. The approximate fiducial distribution of N is given and certain tests
of hypotheses concerning sizes of one and two populations are considered.

It is shown that any sequence of successive independent repetitions of the
sequential procedure, $S(L_1, n_i), S(L_2, n_i), \ldots$ may be improved upon by a
single $S(L, n_i)$. The $S(L, n_i)$ also compares favorably with other procedures
considered.
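
The $\chi^2$ approximation leads to a simple computation. If $t^2/N$ is treated as $\chi^2$ with $2L$ degrees of freedom, the point estimate is $t^2/(2L)$, and a two-sided interval for N follows by dividing $t^2$ by the upper and lower $\chi^2$ quantiles. The sketch below is a hedged illustration: the function names are hypothetical, the closed-form distribution function used is the standard one for even degrees of freedom, and the quantile is found by bisection rather than from tables.

```python
from math import exp


def chi2_cdf_even(x, L):
    """Distribution function at x of chi-square with 2L degrees of freedom."""
    term, total = 1.0, 1.0
    for k in range(1, L):
        term *= (x / 2) / k          # (x/2)**k / k!
        total += term
    return 1.0 - exp(-x / 2) * total


def chi2_quantile_even(p, L, upper=1e4):
    """Quantile of chi-square with 2L degrees of freedom, by bisection."""
    lo, hi = 0.0, upper
    for _ in range(200):
        mid = (lo + hi) / 2
        if chi2_cdf_even(mid, L) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def estimate_N(t, L, alpha=0.10):
    """Approximate point estimate and (1 - alpha) interval for N from t and L."""
    point = t * t / (2 * L)
    q_low = chi2_quantile_even(alpha / 2, L)
    q_high = chi2_quantile_even(1 - alpha / 2, L)
    return point, (t * t / q_high, t * t / q_low)


# Illustrative call only: t = 27 draws before stopping, with L = 1.
point, interval = estimate_N(27, 1)
```

For L = 1 the interval is very wide, which reflects how little information a single reappearance carries; larger L narrows it at the cost of more sampling.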

A numerical illustration of the procedure S(L, n;) is also presented. 

A list of references to discussions of the various practical considerations in- 
volved in sampling-tagging programs and of certain results thereof appears in 
[4] and [5] where other methods of sampling-tagging are analyzed. 


2. Theorems on sufficient statistics and the existence of unbiased estimators. 
Let P be a set of N elements and suppose that a notion of equivalence has been
defined so that the elements of P may be said to belong to $T + 1$ different classes
$P(0), P(1), P(2), \ldots, P(T)$. The number of elements in $P(j)$ is $N(j)$,
$\sum_{j=0}^{T} N(j) = N$, where the value of $N(j)$ is known for all but one class, say $P(0)$.
Let K samples of $n_i$ elements, $i = 1, 2, \ldots, K$, be drawn in order from P in
such a manner that, before the ith sample is drawn, the $n_{i-1}$ elements appearing
in sample $i - 1$ are so labeled and then replaced in P. (Henceforth, this procedure
shall be designated by $F(K, n_i)$.) Suppose h denotes the history of an element;
that is, h is an index denoting the set of tags which have already been
placed on an element. Let $n_i(j, h)$ be the number of elements from class $P(j)$
which appeared in the ith sample with the set of tags corresponding to the hth
history; $n_i(j, 1)$ being the number of elements from class $P(j)$ which first appeared
in the ith sample. We have

THEOREM 1. If the sampling-tagging procedure is $F(K, n_i)$ then the statistic
$\sum_{i=1}^{K} n_i(0, 1)$ is sufficient for estimating $N(0)$.

PROOF. A somewhat stronger result holds; namely, if $\Pr\{n_i(j, h); N(j), n_i, K\}$
denotes the joint probability function of all possible $n_i(j, h)$ (not for a fixed i
and h), then

$$(1) \qquad \Pr\{n_i(j, h); N(j), n_i, K\} = \frac{C\big\{\sum_{i=1}^{K} n_i(0, 1),\ N(0)\big\}\, A(K)}{\prod_{i=1}^{K} C\{n_i, N\}},$$

where henceforth, $C\{a, b\} = b!/(b - a)!$, and where $A(K)$ is a function of
$n_1(j, h), n_2(j, h), \ldots, n_K(j, h)$, $n_1, n_2, \ldots, n_K$ and $N(1), N(2), \ldots, N(T)$.
Equation (1) may be obtained by direct calculations of the probabilities by
means of standard combinatorial methods. The theorem then follows from the
factorization conditions for sufficient statistics.

By Theorem 1, we see that for $F(K, n_i)$ no information is gained by tagging
elements appearing from $P(1), P(2), \ldots, P(T)$ nor by using different tags for
different samples.

Suppose we consider the general class of procedures where random samples 
are drawn from P in such a manner that if it is decided that an ith sample is to 
be drawn, the n,;_, elements of sample 7 — 1 are first so labeled and replaced in 
P before the ith sample is drawn. The total number x of samples drawn will 
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be determined by some stopping rule which may depend on the sample results. 
We shall define a sufficient statistic to be one for which the conditional dis- 
tribution, when z is fixed, can be factored as usual. Clearly by Theorem 1, 
Din nO, 1) = wis a sufficient statistic and in estimating N(O) it will be suffi- 
cient to use » and zx. For some procedures (stopping rules) there may exist a 
function g(u) of the total number y» of untagged elements drawn from N(0) 
which is such that g(u) = z. For such procedures it will be sufficient to use u 
in estimating N(0). 

CorRoLuaRY 1. Given S(L, nj), then p = Din nO, 1) is sufficient for esti- 
mating N(0). 

Proor. Let g(u) be the least integer g such that 


n=2Lb+uh. 
This function g(u) is such that g(u) = x. Q.E.D. 

In view of the preceding results, without any loss of generality, we shall, 
henceforth, consider the problem of estimating the size of a population, which 
is not divided into classes, by means of nondifferentiating tagging methods. 
We let μ_i denote the number of untagged elements drawn in the ith sample of
n_i elements.

Theorem 2. Given S(L, n_i), there exists a minimum variance unbiased estimator
(m.v.u.e.) M(μ).

Proof. It is clear that if μ = Σ_{i=1}^{κ} μ_i, then Pr{μ; N, n_i} is zero for μ > N
and is positive for n_1 ≤ μ ≤ N. Hence, an unbiased estimator M(μ) would
satisfy the system of equations

\[ N = \sum_{\mu = n_1}^{N} M(\mu)\, \Pr\{\mu;\, N, n_i\} \tag{2} \]

for N = n_1, n_1 + 1, n_1 + 2, ... . This system of equations defines an estimator
M(μ) uniquely since its values can be determined recursively for μ = n_1, n_1 + 1,
n_1 + 2, n_1 + 3, ... . That M(μ) is the minimum variance unbiased estimator
follows from Corollary 1 and Blackwell's result [6] that given any other unbiased
estimate, one can obtain an unbiased sufficient estimate whose variance is at
least as small. The uniqueness of an unbiased sufficient estimate ensures that
M(μ) is the desired estimate. Q.E.D.
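The recursive determination of M(μ) from system (2) can be sketched numerically. The following is a minimal illustration, not the paper's own computation. It assumes the simplest procedure S(1, 1) (elements drawn one at a time, sampling stopped at the first reappearance), for which Pr{μ; N} = μ C{μ, N}/N^{μ+1}; this closed form is an assumption derived here for the sketch, and the function names are hypothetical.

```python
from fractions import Fraction


def falling(a, b):
    """C{a, b} = b!/(b - a)!, the falling factorial used in the text."""
    out = 1
    for k in range(a):
        out *= b - k
    return out


def prob_mu(mu, N):
    """Pr{mu; N} under S(1, 1): mu distinct draws followed by one repeat."""
    return Fraction(falling(mu, N) * mu, N ** (mu + 1))


def solve_unbiased(max_mu):
    """Solve N = sum_mu M(mu) Pr{mu; N} recursively for N = 1, 2, ..., max_mu."""
    M = {}
    for N in range(1, max_mu + 1):
        # Contribution of the values M(1), ..., M(N - 1) already determined.
        partial = sum(M[mu] * prob_mu(mu, N) for mu in range(1, N))
        M[N] = (N - partial) / prob_mu(N, N)
    return M


M = solve_unbiased(8)
for mu in sorted(M):
    print(mu, M[mu])
```

Under these assumptions the recursion yields M(μ) = μ(μ + 1)/2, which agrees with the special case L = 1 of the closed form given for S(L, 1) in Section 3.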

The reader will see by examining the preceding proof that a result similar to
Theorem 2 for somewhat more general procedures than S(L, n_i) will still hold.
This result indicates one advantage of sequential procedures since Chapman
([4], pp. 149-150) has shown that no unbiased estimators exist for the procedures
he was considering (fixed number of samples).

Corollary 2. For S(L, n_i) the m.v.u.e. is

\[ M(\mu; L, n_i) = \frac{|a_{ij}|}{|b_{ij}|}, \]

where

\[
b_{ij} =
\begin{cases}
1, & i = \mu - n_1 + 1, \\[4pt]
\dfrac{C(n_1 + i - 1,\; n_1 + j - 1)}{\prod_{s=1}^{g(n_1+i-1)} C(n_s,\; n_1 + j - 1)}, & i < \mu - n_1 + 1,
\end{cases}
\]

with b_{ij} = 0 when undefined, a_{ij} = b_{ij} for i < μ - n_1 + 1,
a_{μ-n_1+1, j} = n_1 + j - 1, and g(μ) is as in Corollary 1.

Proof. In view of the proof of Theorem 2, we need only show that M(μ; L, n_i)
satisfies the system of equations (2). Since the number of samples K is a
function of μ, and hence fixed for each μ, equation (1) may be used to give

\[ \Pr\{\mu; L, N, n_i\} = \frac{C(\mu, N)}{\prod_{s=1}^{g(\mu)} C(n_s, N)}\; h(\mu, n_i, L). \]

Now since

\[ \sum_{\mu = n_1}^{N} \Pr\{\mu; L, N, n_i\} = 1, \]

we have

\[
\begin{aligned}
\frac{C(n_1, n_1)}{\prod_{s=1}^{g(n_1)} C(n_s, n_1)}\, h(n_1, n_i, L) &= 1, \\[4pt]
\frac{C(n_1, n_1+1)}{\prod_{s=1}^{g(n_1)} C(n_s, n_1+1)}\, h(n_1, n_i, L)
 + \frac{C(n_1+1, n_1+1)}{\prod_{s=1}^{g(n_1+1)} C(n_s, n_1+1)}\, h(n_1+1, n_i, L) &= 1, \\[4pt]
\frac{C(n_1, n_1+2)}{\prod_{s=1}^{g(n_1)} C(n_s, n_1+2)}\, h(n_1, n_i, L)
 + \frac{C(n_1+1, n_1+2)}{\prod_{s=1}^{g(n_1+1)} C(n_s, n_1+2)}\, h(n_1+1, n_i, L) & \\
 + \frac{C(n_1+2, n_1+2)}{\prod_{s=1}^{g(n_1+2)} C(n_s, n_1+2)}\, h(n_1+2, n_i, L) &= 1, \\
 &\;\;\vdots
\end{aligned}
\]

Using Cramér's rule, we have

\[ h(\mu, n_i, L) = \frac{|b_{ij}|}{|c_{ij}|}, \]

where

\[ c_{ij} = \frac{C(n_1 + i - 1,\; n_1 + j - 1)}{\prod_{s=1}^{g(n_1+i-1)} C(n_s,\; n_1 + j - 1)} \]

for i, j = 1, 2, ..., μ - n_1 + 1 (with c_{ij} = 0 when undefined). Suppose M_μ
is the m.v.u.e. and let

\[ M_\mu\, h(\mu, n_i, L) = f(\mu, n_i, L). \]

Hence, by equations (2)

\[ \sum_{\mu = n_1}^{N} \frac{C(\mu, N)}{\prod_{s=1}^{g(\mu)} C(n_s, N)}\; f(\mu, n_i, L) = N \]

for N = n_1, n_1 + 1, n_1 + 2, ... and

\[ f(\mu, n_i, L) = \frac{|a_{ij}|}{|c_{ij}|}. \]

Therefore

\[ M_\mu = M(\mu; L, n_i) = \frac{|a_{ij}| / |c_{ij}|}{|b_{ij}| / |c_{ij}|} = \frac{|a_{ij}|}{|b_{ij}|}. \]



3. The m.v.u.e. in special cases.

Theorem 3. For S(L, 1), the m.v.u.e. is

\[ \frac{K(\mu, L)}{K(\mu, L - 1)}, \]

where

\[ K(\mu, L) = \mu \sum_{t_1=1}^{\mu} t_1 \sum_{t_2=1}^{t_1} t_2 \cdots \sum_{t_L=1}^{t_{L-1}} t_L \]

when there are L summations; for example, K(μ, 0) = μ, K(μ, 1) = μ²(μ + 1)/2.

In order to prove this result, we first prove 

Lemma 1. 


M(u; L) = 


= K(u 5) See oS. N) = N**! 


wml 


> Ku, L) 


pam t t — 1 s=0 


co ta _ Ct, N) - K(t od l, s) 


T Gun, ates ? 
N L+s—1 


1l<tsN. 


Proof. We first consider the special case L = 0. We wish to prove that

\[ \sum_{\mu = t}^{N} \frac{\mu\, C(\mu, N)}{N^{\mu+1}} = \frac{C(t, N)}{N^{t}}, \qquad t = 1, 2, \ldots, N. \]

The equality clearly holds for t = N. Now let us suppose the equality holds for
t fixed, and consider t - 1. The left side of the equality becomes

\[
\frac{(t-1)\, C(t-1, N)}{N^{t}} + \frac{C(t, N)}{N^{t}}
 = \frac{N\, C(t-1, N)}{N^{t}}
 = \frac{C(t-1, N)}{N^{t-1}} .
\]

Hence, by inverse mathematical induction we have proven the lemma for the
special case where L = 0 and t = 1, 2, ..., N. The lemma may now be proved
in the general case by induction on L.

Now the proof of Theorem 3 follows as an application of Lemma 1 and the
methods used to prove Corollary 2.
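The estimator of Theorem 3 can be checked by exact enumeration for small populations. The sketch below is illustrative only. It reads K(μ, L) as μ times an L-fold nested sum whose inner upper limits are the preceding summation indices (an interpretation of the formula, consistent with the examples K(μ, 0) = μ and K(μ, 1) = μ²(μ + 1)/2), and it models S(L, 1) as a chain in which each draw hits an already tagged element with probability (number tagged)/N. All function names are hypothetical.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def nested_sum(upper, depth):
    """sum_{t=1}^{upper} t * nested_sum(t, depth - 1); nested_sum(., 0) = 1."""
    if depth == 0:
        return 1
    return sum(t * nested_sum(t, depth - 1) for t in range(1, upper + 1))


def K(mu, L):
    """K(mu, L): mu times the L-fold nested sum, so that K(mu, 0) = mu."""
    return mu * nested_sum(mu, L)


def mvue(mu, L):
    """The ratio K(mu, L) / K(mu, L - 1) of Theorem 3 (needs L >= 1)."""
    return Fraction(K(mu, L), K(mu, L - 1))


def mu_distribution(N, L):
    """Exact law of mu under S(L, 1) for a population of N elements."""
    # After the first draw: 1 tagged element, 0 reappearances.
    state = {(1, 0): Fraction(1)}
    final = {}
    while state:
        nxt = {}
        for (u, r), p in state.items():
            hit = Fraction(u, N)  # next draw is an already tagged element
            if r + 1 == L:
                final[u] = final.get(u, 0) + p * hit  # sampling stops here
            else:
                nxt[(u, r + 1)] = nxt.get((u, r + 1), 0) + p * hit
            if u < N:
                nxt[(u + 1, r)] = nxt.get((u + 1, r), 0) + p * (1 - hit)
        state = nxt
    return final


def expected_estimate(N, L):
    """E{K(mu, L)/K(mu, L - 1)} computed from the exact law of mu."""
    return sum(p * mvue(mu, L) for mu, p in mu_distribution(N, L).items())


print(expected_estimate(5, 2))
```

Exact rational arithmetic is used so that unbiasedness can be confirmed as an equality rather than to within rounding error.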

In general M(μ; L) may be expressed as a rational function in μ,

rt 2 3 P, 
M (u; L) = dL + ( — a) + P,’ 

where P_1 and P_2 are polynomials of degree 2L - 1. Using various recursion
relationships, the exact calculation of M(μ; L) may be simplified by means of
tables (see [3]).

Theorem 4. Consider a modified S(1, n_i), where the only change is the addition
of the condition that no more than R samples are to be drawn. Then

\[ E\{M(\mu; n_i)\} = N - \frac{C(t(R) + 1,\; N)}{\prod_{s=1}^{R} C(n_s, N)}, \]



where 
° _Ille ts _¥ 2 2 ~ | pie =) 
Mu; n,) = sf (x 1) 2 ni + 2t(x | + wz [pte ’ 


when κ is the number of samples drawn before sampling ceased,

\[ \mu = \sum_{s=1}^{\kappa - 1} n_s + \mu_\kappa, \qquad t(\kappa) = \sum_{s=1}^{\kappa} n_s . \]



To show that this theorem is true we first prove 

Lemma 2. For any positive integers n ≤ A < N,

t joc + A) + i] C(N — n —j, N — A)CG, A) 
j= A—n 


A-jt+l 





(N — n — j)!j! 
+ C(n+1,N — A)/n! = C(n, N)N/n!. 
Proof. We have the following identity
(1+ y)" “(1 + y)*in + (n+ Dy} = (1+ y)"n t+ (n+ 1)(1 + y)*y. 


n 


The coefficient of y*~ 
of y is 


‘in the expansion of the right side of the identity in powers 


C(n, N)/(n — 1)! + C(n + 1, N)/n! = C(n, N)N/n!, 


the right side of the equality stated in Lemma 2. The coefficient of y*~" in the 
expansion of the left side of the identity in powers of y may be seen to be equal 
to the left side of the equality in Lemma 2. Hence, the equality is proven. Q.E.D. 

We may prove Theorem 4 for the special case R = 2 using Lemma 2 by setting
A = n_1, n = n_2, and j = n_1 - n_2 + μ_2. We may then proceed to prove
the theorem in general by mathematical induction on R using again Lemma 2.

By this method of proof we also see that M(μ; n_i) is the m.v.u.e. for the
original S(L, n_i). The calculation of M(μ; n_i) simplifies considerably in the
special case where n_2 = n_3 = n_4 = ... = n (which is of some interest in
applications). Similar kinds of results may be obtained for L = 2, and
n_2 = n_3 = n_4 = ... = n.


4. Limit theorems. An interesting dual relationship exists between the family
of sampling procedures S(L, n_i) and F(K, n_i) which will be useful. Namely, if
H(L; K, n_i, N) is the distribution function of the total number L of tagged
elements appearing when the sampling procedure F(K, n_i) is used, and if
G(K; L, n_i, N) is the distribution function of the total number K of samples
drawn before sampling ceased when S(L, n_i) is applied, then

\[ G(K; L, n_i, N) = 1 - H(L - 1; K, n_i, N). \]

We also have, then, that

\[
h(L; K, n_i, N) = H(L; K, n_i, N) - H(L - 1; K, n_i, N)
                = G(K; L, n_i, N) - G(K; L + 1, n_i, N)
\]

and

\[
g(K; L, n_i, N) = G(K; L, n_i, N) - G(K - 1; L, n_i, N)
                = H(L - 1; K - 1, n_i, N) - H(L - 1; K, n_i, N).
\]

Henceforth, these relationships will be designated as Relation A.

Throughout the following sections {n_i} is to be any bounded sequence of
positive integers. We now have
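Theorem 5. Let K(N) be any integer-valued function such that

\[ \lim_{N \to \infty} \frac{t[K(N)]^{2}}{N} = y, \qquad \text{where } t[\kappa] = \sum_{i=1}^{\kappa} n_i . \]

Then

\[ \lim_{N \to \infty} h(L; K, n_i, N) = \frac{e^{-y/2}\, (y/2)^{L}}{L!}, \]

where h(L; K, n_i, N) is as in Relation A.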
In order to prove this result, we need the following lemma: 
Lemma 3.

\[ h(L; K, n_i, N) = \frac{C(t[K] - L,\; N)}{\prod_{s=1}^{K} C(n_s, N)}\; p(n_i, L), \]

where p(n_i, L) is a polynomial in n_i such that

\[ p(n_i, L) = \frac{t[K]^{2L}}{2^{L} L!} + O\big(t[K]^{2L-1}\big) \]

when K = K(N).

This lemma may be proved directly from the fact that

\[ t[K]^{L} = \sum n_{i_1} n_{i_2} \cdots n_{i_L} + O\big(t[K]^{L-1}\big), \]

where the summation is taken over all sets (i_1, i_2, ..., i_L) such that i_s ≠ i_t
for s ≠ t and i_s = 1, 2, ..., K. This statement may be proved by induction
on L.


Theorem 5 now follows from Lemma 3 and the fact that

\[ \lim_{N \to \infty} \frac{C(t[K(N)],\; N)}{\prod_{s=1}^{K(N)} C(n_s, N)} = e^{-y/2}, \]

a result obtained using Stirling's formula.
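The Poisson limit of Theorem 5 can be examined numerically. The following is a minimal sketch, assuming the special case n_i = 1 (so that t[K] = K) and computing the exact probability of L reappearances among K single draws by a forward recursion. The population size 10^6 and K = 2000 (giving y = 4) are arbitrary illustrative choices, and the function names are hypothetical.

```python
import math


def repeats_pmf(K, N, r_max):
    """Exact Pr{r tagged elements reappear in K single draws}, r = 0..r_max."""
    p = [0.0] * (r_max + 1)
    p[0] = 1.0
    for k in range(K):  # k elements have been drawn so far
        new = [0.0] * (r_max + 1)
        for r, pr in enumerate(p):
            if pr == 0.0:
                continue
            hit = (k - r) / N  # k - r distinct elements carry tags
            new[r] += pr * (1.0 - hit)
            if r + 1 <= r_max:
                new[r + 1] += pr * hit  # mass beyond r_max is dropped
        p = new
    return p


def poisson_limit(L, y):
    """Limiting value exp(-y/2) (y/2)^L / L! stated in Theorem 5."""
    return math.exp(-y / 2) * (y / 2) ** L / math.factorial(L)


N, K = 10**6, 2000
y = K * K / N
exact = repeats_pmf(K, N, 5)
for L in range(6):
    print(L, round(exact[L], 4), round(poisson_limit(L, y), 4))
```

With these values the exact probabilities and the limiting Poisson terms with mean y/2 = 2 should agree closely, which is the behaviour the theorem describes for large N.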
Theorem 6. Let D(y) be the distribution function of t(κ)²/N = y where t(κ) =
Σ_{i=1}^{κ} n_i when sampling proceeds according to S(L, n_i). Then

\[ \lim_{N \to \infty} D(y) = \frac{1}{2^{L}\, \Gamma(L)} \int_{0}^{y} x^{L-1} e^{-x/2}\, dx . \]

The result follows by induction on L and use of Relation A applied to
Theorem 5.

We also have 

Corotuary 3. If S(L, ni) is used, then the limiting distribution of t/+/N i 
1x? with 2L degrees of freedom. 

Proof. This is a direct application of a theorem stated by Wilks ([7], p. 219).

Suppose the sampling procedure S(L, n_i) was independently repeated m_L
times, for L = 1, 2, ..., with Σ_{L=1}^{∞} m_L = m. By the preceding result it is clear that


mI 


arte) = TE PLT Rats LONI FF ye ace 


is approximately true when dealing with large N, where t_{Li} is the number of
elements drawn when S(L, n_i) was performed for the ith time. Henceforth, we
shall assume that the preceding statement is exactly so; that is, all statements
made should now be interpreted as being only approximately true, the
approximation being close when we are dealing with large N.

Since it is easy to see that Σ_{L=1}^{∞} Σ_{i=1}^{m_L} t_{Li}² is sufficient for estimating N, we
may use a result stated by Kendall ([8], Vol. 2, p. 54) and the Blackwell
theorem [6] to show that

\[ M = \frac{\sum_{L=1}^{\infty} \sum_{i=1}^{m_L} t_{Li}^{2}}{2 \sum_{L=1}^{\infty} L\, m_L} \]

is the m.v.u.e. (asymptotic) of N. It can easily be seen that

\[ \sigma_M^{2} = \frac{N^{2}}{\sum_{L=1}^{\infty} L\, m_L}, \]

and hence we may obtain standard errors for our estimates of N.

We shall now see that the best results will be obtained when Σ_{L=1}^{∞} m_L = 1;
that is, when S(L, n_i) is not repeated. More precisely:

Theorem 7. Consider all sequences of nonnegative integers m_1, m_2, m_3, ...,
such that Σ_{L=1}^{∞} L m_L is a fixed integer. Suppose the sampling procedure S(L, n_i) was
independently repeated m_L times, for L = 1, 2, 3, ... . Then the variance of the
m.v.u.e. (asymptotic) of N is the same for each sequence. The expected number of
elements drawn is a minimum when Σ_{L=1}^{∞} m_L = 1.

Proof. Since σ_M² = N²/Σ_{L=1}^{∞} L m_L, the variance of M, the m.v.u.e. (asymptotic)
of N, is the same for each sequence. The expected number of elements drawn,
divided by √N, approaches

\[ \sqrt{2\pi}\, \sum_{L=1}^{\infty} \frac{m_L\, \Gamma(2L)}{\Gamma^{2}(L)\, 2^{2L-1}} \]

as N becomes large. The following inequality may be proved directly:

\[
\frac{\Gamma\big(2 \sum_{L=1}^{\infty} L m_L\big)}{\Gamma^{2}\big(\sum_{L=1}^{\infty} L m_L\big)\, 2^{\,2 \sum_{L=1}^{\infty} L m_L}}
\;\le\; \sum_{L=1}^{\infty} \frac{m_L\, \Gamma(2L)}{\Gamma^{2}(L)\, 2^{2L}},
\]

equality holding only in the trivial case where Σ_{L=1}^{∞} m_L = 1. Hence, the expected
number of elements drawn is a minimum when Σ_{L=1}^{∞} m_L = 1. Q.E.D.

Hence, we see that the amount of information about N is greater when S(L, n_i)
is performed once if the expected number of elements sampled is held constant.
Since M is also an efficient estimator of N, we have the result that
(-1 + M/N)√(Σ L m_L) approaches normality as Σ L m_L becomes large, in a sense
more rapidly when we confine ourselves to not repeating S(L, n_i).


5. An exact distribution and comparisons with limit results. It is clear that
the distributions of t²/N and μ²/N behave similarly for large populations. We
believe that when the n_i are relatively small, the use of t gives us a good
approximation to the limiting results. However, when n_i is not small, then μ + L
might be better. The following table shows that when the population size is
only N = 365 (the archeologist's problem), L = 1, and n_i = 1, we obtain good
approximations using μ, and still better approximations using μ + 1 = t.


TABLE 1

Comparison of the exact distribution and the χ² approximation with 2 degrees of freedom
for N = 365

Probability   .01   .02   .05   .1    .2    .3    .5    .7    .8    .9    .95

χ²            .02   .04   .10   .21   .45   .71   1.39  2.41  3.22  4.61  5.99

t²/365        .02   .04   .10   .22   .46   .70   1.33  2.30  3.17  4.38  5.80

For example, Pr{χ² < .71} = .3, Pr{t²/365 < .70} = .3, and
Pr{μ²/365 < .62} = .3.
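A comparison of this kind can be sketched directly. The code below is an illustration under stated assumptions, not a reproduction of Table 1: it takes L = 1 and n_i = 1, treats t as the total number of elements drawn up to and including the first reappearance, and compares Pr{t ≤ k} with the χ² distribution function with 2 degrees of freedom evaluated at k²/365. The quantile convention used to build the printed table is not stated in the text, so individual entries need not match. Function names are hypothetical.

```python
import math


def exact_cdf_t(k, N=365):
    """Pr{t <= k}: at least one reappearance within the first k draws."""
    all_distinct = 1.0
    for i in range(k):
        all_distinct *= 1.0 - i / N
    return 1.0 - all_distinct


def chi2_cdf_2df(x):
    """Distribution function of chi-square with 2 degrees of freedom."""
    return 1.0 - math.exp(-x / 2.0)


for k in (5, 10, 15, 23, 30, 40):
    x = k * k / 365
    print(k, round(x, 2), round(exact_cdf_t(k), 3), round(chi2_cdf_2df(x), 3))
```

For these values of k the two columns differ only in the second decimal place, which is in line with the text's remark that the approximation is already good for N = 365.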


6. Approximate confidence intervals, fiducial limits, and tests of hypotheses.
We repeat that all statements made in the following sections should be
interpreted as approximately so, in so far as we shall assume that t²/N has its
limiting χ² distribution.

Since the distribution function of t²/N is independent of N, we may set up
one- or two-sided confidence intervals for N. It is easy to see that the theory of
fiducial inference leads to the following fiducial distribution of N:

\[ dF(N \mid t) = \left(\frac{t^{2}}{N}\right)^{L} \frac{e^{-t^{2}/(2N)}}{2^{L}\, \Gamma(L)}\, \frac{dN}{N}, \]

and that limits obtained in this way for N are the same as those obtained by the
confidence interval approach.

To test the hypothesis at a level of significance of ε that N = N_0 against the
alternative that N > N_0, the region of rejection is t² > c_1 N_0, where c_1 is such that

\[ \int_{c_1}^{\infty} dF(\chi^{2} \mid 2L) = \epsilon . \]

A similar result holds for the alternative N < N_0.

To test the hypothesis at a level of significance of ε that N = N_0 against the
alternative N ≠ N_0, a uniformly most powerful unbiased test exists. The region
of rejection is t² < c_2 N_0 and t² > c_3 N_0, where c_2 and c_3 are such that

\[ \int_{c_2}^{c_3} dF(\chi^{2} \mid 2L) = 1 - \epsilon \quad \text{and} \quad c_2^{L} e^{-c_2/2} = c_3^{L} e^{-c_3/2} . \]

If we wish to test the hypothesis that two populations were of the same size,
Fisher's Z distribution, with parameters ν_1 = 2L_1 and ν_2 = 2L_2, seems appropriate.
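A two-sided interval for N based on the limiting χ² distribution of t²/N can be sketched as follows. This is a minimal illustration, assuming an equal-tailed interval (a choice made here for simplicity, not prescribed by the text) and using the closed form of the χ² distribution function for an even number 2L of degrees of freedom. The inputs t = 765 and L = 25 are taken from the numerical illustration of Section 8; the function names are hypothetical.

```python
import math


def chi2_cdf_even(x, L):
    """Distribution function of chi-square with 2L degrees of freedom."""
    if x <= 0:
        return 0.0
    term, total = 1.0, 1.0
    for k in range(1, L):
        term *= (x / 2.0) / k  # (x/2)^k / k!
        total += term
    return 1.0 - math.exp(-x / 2.0) * total


def chi2_quantile_even(p, L):
    """Quantile of chi-square with 2L degrees of freedom, by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_cdf_even(hi, L) < p:
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_cdf_even(mid, L) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def confidence_interval(t, L, level=0.95):
    """Equal-tailed interval for N from t^2/N ~ chi-square(2L)."""
    tail = (1.0 - level) / 2.0
    q_low = chi2_quantile_even(tail, L)
    q_high = chi2_quantile_even(1.0 - tail, L)
    return t * t / q_high, t * t / q_low


low, high = confidence_interval(765, 25)
print(round(low), round(high))
```

Because t²/N is pivotal under the stated approximation, dividing t² by the upper and lower χ² quantiles gives the lower and upper limits for N respectively.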


7. Comparisons. Let us first consider the nonsequential sampling procedure
F(1, n). It is easy to see that the maximum likelihood estimate of N in this case
is the greatest integer less than or equal to n²/m = T, where m is the number
of marked elements appearing in the second sample. When n = O(√N), we
have that the expected value of T/N approaches infinity. We have considered
the case where n = O(√N) since the sequential sampling-tagging with which
it is compared has the property that the expected number of elements drawn is
O(√N). However, if we are willing to draw samples of size KN, then, although
T is still not unbiased, we have that T/N converges in probability to one.

Suppose we consider the following sequential procedure. A sample of n_1
elements is drawn from a population P of N elements, tagged and replaced in P.
Then elements are drawn one at a time and then replaced until a total of L
tagged elements have been drawn. (This is similar to the sequential
sampling-tagging procedure we have been discussing except that n_2 = n_3 = ... = 1 and
we do not tag elements appearing after the first sample.) This scheme was studied
by Haldane ([9], p. 222) for the purpose of estimating p = n_1/N, while we are
interested in the estimation of n_1/p. We can without much difficulty show that
if this sequential non-retagging method is used, then

\[ E\{n_1 y / L\} = N, \]

and σ²_{n_1 y/L} = (N² - n_1 N)/L, where y is the number of elements drawn, after
the n_1 of the first sample, before sampling ceased. If we consider n_1 as fixed, then
the total number of elements drawn is O(N), whereas the variance is no better
than that obtained in the sequential sampling-tagging case for only O(√N)
elements. Now let us suppose that n_1 = K√N. Then E(y) = L√N/K and the
expected value of the total number of elements drawn is √N K(1 + L/K²).
Since K ≠ 0, it is easy to see that for a given L, the minimum value of
√N K(1 + L/K²) is obtained when K = √L, and so we have 2√(NL). When N is large,
we have seen in the sequential sampling-tagging procedure that the expected
value of the total number of elements drawn is about √N E(√χ²), where χ²
has 2L degrees of freedom. Since E(χ²) = 2L, and since √E(χ²) ≥ E(√χ²),
we have that 2√(NL) ≥ √(2NL) ≥ E(√χ²)√N. Hence, we have shown in this
case that, for a given variance, the expected value of the total number of
elements drawn before cessation is smallest when the sequential sampling-tagging
procedure is used for large populations. Let us now consider the case where
n_1 = KN, K < 1. Then for any sequential nonretagging procedure, there exists
a sequential sampling-tagging method which obtains estimators whose variance
is smaller, and also has the property that the expected value of the total number
of elements necessarily drawn in the process is smaller than in the given
sequential nonretagging method used for large populations.


8. A numerical illustration. Seven samples of 100 (n_1 = n_2 = ... = n_6 = n_7)
followed by samples of one (n_8 = n_9 = ... = 1) were drawn from a population
of N = 10,000 random numbers [10]. Sampling ceased when a total of L = 25
elements had reappeared. On the basis of the sample results, we then estimated N.
The standard deviation of our estimate is about σ = 10,000/5 = 2,000. The
results of this sampling experiment are summarized in Table 2, which Mr. Sylvanus
Tyler of Argonne National Laboratory was kind enough to prepare. We
find that the estimate of N is t²/50 = (765)²/50 = 11,704.5. Also an estimate of σ
is 11,704.5/5 = 2,340.9. The case where N is known is of little practical use but
serves to illustrate the methods presented herein.
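The arithmetic of this illustration can be retraced in a few lines. The sketch below assumes, following Table 2, that the seven samples of 100 were followed by 65 samples of one, and applies the asymptotic estimate t²/(2L) with standard deviation N/√L from Section 4; the variable names are hypothetical.

```python
import math

# Sample sizes read from Table 2: seven samples of 100, then 65 single draws.
sample_sizes = [100] * 7 + [1] * 65
L = 25  # number of reappearances at which sampling ceased

t = sum(sample_sizes)                       # total number of elements drawn
estimate = t ** 2 / (2 * L)                 # asymptotic m.v.u.e. of N
std_error = estimate / math.sqrt(L)         # estimate of sigma = N / sqrt(L)
true_sigma = 10_000 / math.sqrt(L)          # sigma when N = 10,000 is known

print(t, estimate, round(std_error, 1), true_sigma)
```

The estimate lies within one standard deviation of the true population size of 10,000.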


TABLE 2

Reappearance of tagged elements in samples from a population of 10,000
random numbers

Sample     Elements in Sample     Tagged Elements in Population     Elements Reappearing

1          100                    0                                 0
2          100                    100                               4
3          100                    196                               4
4          100                    292                               1
5          100                    391                               3
6          100                    488                               2
7          100                    586                               7
8 to 72    65                     679                               4

Total      765                    740                               25


ON WALD'S COMPLETE CLASS THEOREMS¹


By J. Kiefer
Cornell University


1. Summary. The purpose of this paper is to prove certain results concerning 
complete classes of strategies, some of which were announced in an abstract in 
Bull. Am. Math. Soc., Vol. 57 (1951), p. 372. 


2. Introduction. Except where explicitly stated to the contrary, we shall use 
the nomenclature and notation of Chapter 2 of [1] concerning zero-sum two- 
person games. Our considerations here do not require, however, that the payoff 
function K(a, b) be bounded (or finite), but merely that it be bounded (by zero, 
without loss of generality) from below (because if unbounded in both directions, 
expectation relative to a mixed strategy might be undefined). This generaliza- 
tion is of use in some games and statistical work, as will be seen below. We 
remark without proof that such results as weakened forms of Theorems 1 and 4 
of [2] may be proved under this set up. For example, we shall later use the 
following: 

Theorem 1. Suppose that Ξ and H are convex spaces of allowable strategies for
players 1 and 2, respectively, that 0 ≤ K(a, b) ≤ ∞, that H is weakly compact
relative to Ξ in the sense of Wald (i.e., for any sequence {η_i} in H there is a
subsequence {η_{i_j}} and an η_0 in H such that lim inf_{j→∞} K(ξ, η_{i_j}) ≥ K(ξ, η_0) for all ξ
in Ξ), and that there exists a sequence {ξ_i} in Ξ such that for any ξ in Ξ and η in
H there is a subsequence {ξ_{i_j}} (all of whose elements may be the same) which may
depend on ξ and η and is such that lim_{j→∞} K(ξ_{i_j}, η) ≥ K(ξ, η). Then the game is
determined.

The weak compactness assumption is enough to assure the existence of a 
minimax strategy for player 2. The above conditions may be weakened as in 
Theorem 4 of [2] or even further, and a generalization of Theorem 5 of [2] (which 


should be corrected there by assuming gp to be independent of €) may similarly 
be proved. 


3. Admissible strategies and complete classes. Wald considered two types of 
complete class theorems: those which give conditions under which the class of 
admissible strategies (e.g., of player 2) is complete, and those which give condi- 
tions under which the class of minimal strategies in the strict or wide sense is 
complete. The latter will occupy most of this paper. We remark, regarding the 
former, that the proof used by Wald in Theorem 2.22 of [1] actually suffices to 
prove the following: 

Theorem 2. Let Ξ and H be arbitrary spaces of mixed strategies with the property
that there exists a denumerable subset Ξ* of Ξ such that, if η' and η'' are any
members of H for which K(ξ, η') ≤ K(ξ, η'') for all ξ with strict inequality for some ξ,
then there is a ξ' in Ξ* with K(ξ', η') < K(ξ', η''). Suppose also that H is weakly
compact relative to Ξ in the sense of Wald. Then the class of all admissible strategies
of player 2 is minimal complete.

Note that K(ξ, η) is not assumed bounded. An application of this theorem
which indicates the usefulness of the hypothesis as stated herein over the stronger
condition stated in Theorem 2.22 of [1] will be given in the next paragraph. We
remark here that the condition of Theorem 2 is not necessary; for example,
let A as well as B consist of all integers, Ξ and H consist of all probability
measures on A and B, and K(a, b) = 0 if a = b > 0 and K(a, b) = 2^{-|b|} otherwise;
the class of all admissible strategies (those giving probability 1 to a single
element b > 0) is then minimal complete, but H is not even weakly compact for
every sequence of strategies for which each strategy is better than its predecessor
(as is evidenced by the sequence of pure strategies b = 0, -1, -2, ...). On the
other hand, the theorem does not remain valid if only weak compactness (but
not the condition on Ξ*) is assumed. For example, let A as well as B consist of
all ordinals less than the first uncountable ordinal, let Ξ and H consist of all
discrete probability measures on A and B, and let K(a, b) = -1 or 1 according
to whether a < b or a ≥ b, respectively. Then the condition of weak compactness
is satisfied, but no strategy is admissible. (This example also illustrates
why weak compactness alone is not enough to insure the determinateness of the
game.) The above theorem may be generalized in an obvious manner by replacing
the condition of weak compactness by a similar one on all well-ordered subsets
of H whose power does not exceed that of some infinite Ξ* with the stated
property. (It is enough to consider only subsets of H whose members become "better"
with increasing index.) It follows that the bicompactness condition used in
Theorem 3 of [2], which implies such a condition for every subset of H, also
implies the conclusion of Theorem 2 above.

As an important statistical application of Theorem 2, which also illustrates
the advantage of using the condition on Ξ* stated therein rather than that of
the separability of Ξ* in the sense of the intrinsic metric (2.4) of [1], we shall now
prove the following:

Theorem 3. Under Assumptions 3.1 to 3.6 of [1], the class of all admissible
decision functions is minimal complete.

This theorem extends the result of Theorem 2.22 of [1] to the setup of
Chapter 3 of [1]. To prove it we let Ξ* = ∪_{i=1}^{∞} Ξ_i, where Ξ_i is a denumerable set of
a priori distributions which is dense in Ξ in the sense of the metric ρ_i(ξ_1, ξ_2) =
sup |r(ξ_1, δ) - r(ξ_2, δ)|, the supremum being taken over all decision functions
δ requiring at most i stages of experimentation. The existence of such Ξ_i follows
from Theorems 3.3 and 2.16 of [1]. We shall show that Ξ* satisfies the assumption
of Theorem 2. Let δ_1 and δ_2 be two decision functions and ε a positive number
such that r(ξ) ≥ 0 and sup_ξ r(ξ) > 2ε, where (using the notation of [1]) r(ξ) =
r(ξ, δ_1) - r(ξ, δ_2) (with the definition ∞ - ∞ = 0). We need only show that
r(ξ') > 0 for some ξ' in Ξ*. Let r_{1,m}(ξ) = sup r(ξ, δ''), the supremum being over
all δ'' requiring not more than m stages of observation and such that r(ξ, δ'') ≤
r(ξ, δ_1) + ε; if no such δ'' exists, we define r_{1,m}(ξ) = 0. Similarly, let r_{2,m}(ξ) =
inf r(ξ, δ''), the infimum being over all δ'' requiring not more than m stages of
observation and such that r(ξ, δ'') ≥ r(ξ, δ_2); if no such δ'' exists, we define
r_{2,m}(ξ) = +∞. Clearly, for each ξ, r_{1,m}(ξ) is nondecreasing with m (we assume
without loss of generality that the weight function W is nonnegative) and
r_{2,m}(ξ) is nonincreasing with m. Moreover, noting Lemma 3.3 of [1], and
defining r_m(ξ) = r_{1,m}(ξ) - r_{2,m}(ξ), we see that r(ξ) ≤ lim_{m→∞} r_m(ξ) for every ξ for
which r(ξ, δ_2) is finite. Moreover, r_m(ξ) ≤ r(ξ) + ε is nondecreasing in m,
so that

\[
\begin{aligned}
\epsilon + \sup_{\xi \in \Xi^*} r(\xi)
 &\ge \sup_{\xi \in \Xi^*} \lim_{m \to \infty} r_m(\xi)
  = \sup_{\xi \in \Xi^*} \sup_{m} r_m(\xi) \\
 &= \sup_{m} \sup_{\xi \in \Xi^*} r_m(\xi)
  = \sup_{m} \sup_{\xi \in \Xi} r_m(\xi)
  = \sup_{\xi \in \Xi} \sup_{m} r_m(\xi) \\
 &\ge \sup_{\xi \in \Xi} r(\xi) > 2\epsilon,
\end{aligned}
\]

completing the proof. (It is essential here that r_m(ξ) is increasing in m, so that
the operations "lim" and "sup" may be interchanged.)


4. Minimal strategies and complete classes. We now turn to our main theorem,
which generalizes Theorem 2.25 of [1]. The proof of the theorem is followed by
two applications. The first of these is an essential strengthening of Theorems
3.17 and 3.20 of [1] regarding statistical decision functions. The second weakens
the conditions of Theorem 2.25 of [1], even when K(a, b) is bounded.

The idea of forming a new game with payoff function K*(a, b) is Wald's, and
the proof of the first part of the conclusion of the theorem below is that of
Theorem 2.25 of [1] if K(a, b) is bounded. (The last part of the conclusion was proved
under the stronger conditions that K(a, b) is bounded and A and B are compact,
so that minimality in the wide and strict senses are equivalent, in Theorem 3.10
of [4].) In the bounded case, any condition entailing the determinateness of
the game and existence of a minimax strategy for player 2 and whose validity
relative to K implies its validity relative to K* (e.g., the condition of Theorem
2.25 of [1] or of Theorem 3 of [2]), also obviously entails the conclusion of the
theorem below. When K(a, b) is unbounded, one must be careful to use K*(ξ, η)
only where K(ξ, η) and K(ξ, η_0) are not both infinite. Otherwise, K* may not be
properly defined. At the same time, it is useful to state the theorem in terms of
the Ξ_N of the theorem rather than only in terms of A*, since in many
applications the Ξ_N may be chosen so that K* is bounded from below on each Ξ_N (but
not necessarily on A*), so that in verifying condition (b) in applications one
may use such results as that italicized in the first paragraph of this paper.

We recall (putting ∞ - ∞ = 0 in our case) that a strategy η' is minimal in
the wide sense if

\[ \inf_{\xi \in \Xi} \Big[ K(\xi, \eta') - \inf_{\eta \in H} K(\xi, \eta) \Big] = 0. \tag{1} \]

Theorem 4. Suppose 0 ≤ K(a, b) ≤ ∞, Ξ ⊃ A, H ⊃ B, and that for any η_0
for which inf_a K(a, η_0) < ∞ and which is not a member of the class C_W of all
minimal strategies in the wide sense, there exists a sequence {Ξ_N} (N = 1, 2, ..., ad inf)
of subsets of Ξ such that

(a) lim inf_{N→∞} Ξ_N ⊃ A* = {a | K(a, η_0) < ∞}; for every N, K(ξ, η_0) < ∞
for all ξ in Ξ_N; if sup_a K(a, η_0) < ∞, Ξ_N ⊃ A;

(b) the game relative to Ξ_N, H, and K*(ξ, η) = K(ξ, η) - K(ξ, η_0) is determined
and player 2 has a minimax strategy for this game.

If sup_{a,b} K(a, b) = +∞, suppose also that H is weakly compact relative to A*
for each η_0 ∉ C_W for which inf_a K(a, η_0) < sup_a K(a, η_0) = +∞. (If H is weakly
compact relative to A, this is automatically satisfied.)

Then C_W is complete. Moreover, for any η_0 not in C_W there is an η_1 in C_W and
an ε > 0 such that K(ξ, η_1) ≤ K(ξ, η_0) - ε for all ξ in Ξ.

Proof. We suppose C_W ≠ H, or the theorem is trivial; in particular, inf_{a,b}
K(a, b) < ∞, since otherwise C_W = H. We now show that C_W is not empty. If
there is an η_0 ∉ C_W with sup_a K(a, η_0) < R < ∞, it follows from (b) that there
is a minimax strategy η' relative to Ξ_N, H, and K*. Since this game is
determined and Ξ ⊃ Ξ_N ⊃ A in this case, it is easy to verify that the game relative
to Ξ, H, and K* is determined, that η' is minimax for it, and hence that η' is
minimal in the wide sense relative to Ξ, H, and K* (since 0 ≥ K*(a, η') ≥ -R,
the proof of Theorem 2.17 of [1] applies), and hence relative to Ξ, H, and K.
On the other hand, if no such η_0 exists, the first sentence of the proof shows that
there must exist an η_0 with non-empty A* and (by the assumption following
(b)) such that there exists a minimal strategy relative to any member of A*.
At any rate, C_W is not empty.

Let η_0 be any member of H which is not in C_W. If K(a, η_0) = +∞ for all a,
any η' in C_W (which is non-empty by the previous paragraph) is uniformly better
than η_0 and is such that K(ξ, η') ≤ K(ξ, η_0) - 1 for all ξ. Hence, we may assume
in what follows that inf_a K(a, η_0) < ∞, and that the Ξ_N corresponding to this
η_0 are non-empty for all N not less than some N_0. We now let Ξ* = ∪_N Ξ_N,
and define

\[ \epsilon = \inf_{\xi \in \Xi^*} \Big[ K(\xi, \eta_0) - \inf_{\eta} K(\xi, \eta) \Big]. \tag{2} \]

(It is clear that inf_η K(ξ, η) < ∞ for all ξ. Otherwise, every η in H would be
minimal and we would have C_W = H.) Clearly, ε > 0, or by (1) (with η' = η_0) η_0
would be minimal in the wide sense. Moreover, ε < ∞, since Ξ_{N_0} is non-empty.

For any N ≥ N_0, let η_N be a minimax strategy for the game described in (b),
so that

\[ \sup_{\xi \in \Xi_N} \inf_{\eta} K^*(\xi, \eta) = \inf_{\eta} \sup_{\xi \in \Xi_N} K^*(\xi, \eta) = \sup_{\xi \in \Xi_N} K^*(\xi, \eta_N). \tag{3} \]

The common value of (3) is less than or equal to -ε; for if, to the contrary, it
were -ε + 2ρ for some ρ > 0, there would by (3) exist a ξ_0 in Ξ_N for which

\[ -\epsilon + \rho \le \inf_{\eta} K^*(\xi_0, \eta) = \inf_{\eta} K(\xi_0, \eta) - K(\xi_0, \eta_0), \tag{4} \]

which would contradict (2). Hence, we must have

\[ K(\xi, \eta_N) \le K(\xi, \eta_0) - \epsilon \quad \text{for all } \xi \text{ in } \Xi_N . \tag{5} \]


Let the subsequence {N_j} (j = 1, 2, ..., ad inf) of the positive integers and the
strategy η* ∈ H be (as guaranteed by weak compactness relative to A* if sup_a
K(a, η_0) = +∞, and putting η* = η_N if sup_a K(a, η_0) < ∞ and assuming
without loss of generality that Ξ_N = Ξ_{N'} for N > N' in this case) such that

\[ \liminf_{j \to \infty} K(a, \eta_{N_j}) \ge K(a, \eta^*) \quad \text{for all } a \text{ in } A^* . \tag{6} \]

It follows from (5) that

\[ K(a, \eta^*) \le K(a, \eta_0) - \epsilon \quad \text{for all } a \text{ in } A^* ; \tag{7} \]

that is, K(a, η*) ≤ K(a, η_0) - ε for all a for which K(a, η_0) < ∞. Since the
latter set is nonempty, η* is uniformly better than η_0, and in fact

\[ K(\xi, \eta^*) \le K(\xi, \eta_0) - \epsilon \quad \text{for all } \xi \text{ in } \Xi . \tag{8} \]

The minimality in the wide sense of η* (i.e., the verification of (1) for η' = η*)
is a direct consequence of (8), the fact that Ξ* is nonempty, and (2). This
completes the proof of the theorem.

Application I. In the terminology of Chapter 3 of [1], let 𝔇 be the class of
all decision functions and 𝔇_b the class of all decision functions with bounded
risk functions. Let C_s be the class of all Bayes solutions in the strict sense and
C_W the class of all Bayes solutions in the wide sense. Wald showed that, under
Assumptions 3.1 to 3.6 of [1], C_W is complete relative to 𝔇_b (Theorem 3.17 of
[1]), and that, under Assumptions 3.1 to 3.7 of [1], C_s is complete relative to
𝔇_b (Theorem 3.20 of [1]). (These theorems were also proved by Wald under
stronger conditions in [3], [4], and [5], and were stated under stronger conditions
in [6]. In [3] and [4] (by Condition 7 of the latter) the risk function is always
bounded. Theorems 2.6, 2.7, 3.5, and 3.6 of [5] are stated correctly, relative to
𝔇_b. The proofs of Theorems 2.5 and 3.4 of [5] are correct only if the statement
of these theorems is interpreted relative to 𝔇_b; otherwise, the statement
following equation (2.72) of [5] is false, since the W* defined there need not satisfy
Condition 2.2 of [5].) If, using Wald's notation and in particular putting δ_0
for η_0 and r(F, δ_0) for K(a, η_0), one defines the Ξ_N of our theorem to consist of
all ξ for which ξ(A_N) = 1, where A_N = {a | K(a, η_0) ≤ N}, it is easy to verify
that A_N, the terminal decision space D^t, and the weight function W*(F, d) =
W(F, d) - r(F, δ_0) (when restricted to A_N) satisfy Assumptions 3.1 to 3.6 (and
3.7) of [1] whenever Ω, D^t, and W(F, d) satisfy the corresponding
assumptions. Hence, Theorems 3.4, 3.7, and 3.2 of [1] imply (putting r*(F, δ) =
r(F, δ) - r(F, δ_0) for our K*) that condition (b) and the condition which follows
it in our theorem are satisfied, so that the conclusion of Theorem 4 holds. Hence,
we have proved the following:

Theorem 5. In the statements of Theorem 3.17 and Theorem 3.20 of [1], 𝔇_b
may be replaced by 𝔇.

The proof for Theorem 3.20 uses Theorem 3.15. The last part of the conclusion
of Theorem 4, when applied to the present case, yields a result not proved
in [1] but proved under stronger conditions (e.g., all risk functions are bounded)
in Theorem 4.11 of [4].

APPLICATION II. Suppose $0 \le K(a, b) \le \infty$, that $\Xi$ and $H$ are convex, that 
$H$ is weakly compact relative to $\Xi$, and that there is a countable subset $\Xi^* = 
\{\xi_i\}$ of $\Xi$ such that, given any $\xi$ in $\Xi$, there is a subsequence $\{\xi_{i_j}\}$ of $\Xi^*$ (whose 
elements are not necessarily different) such that $\lim_{j\to\infty} K(\xi_{i_j}, \eta) = K(\xi, \eta)$ 
for all $\eta$ in $H$. We define $\Xi_N = \{\xi \mid K(\xi, \eta_0) < N;\ \xi \in \Xi\}$, and we note that only 
(b) need be verified to assure the applicability of Theorem 4. It is easy to verify 
that $H$ is weakly compact relative to $\Xi_N$ and the payoff function $K^*$. Moreover, 
for any $\xi$ in $\Xi_N$ there is by assumption a subsequence $\{\xi_{i_j}\}$ of $\Xi^*$ with 
$\lim_{j\to\infty} K(\xi_{i_j}, \eta) = K(\xi, \eta)$ for all $\eta$ in $H$. In particular, this holds for $\eta = \eta_0$, so that 
$K(\xi_{i_j}, \eta_0) < N$ for sufficiently large $j$. We conclude that $\Xi^* \cap \Xi_N$ satisfies, relative 
to $\Xi_N$, $H$, $K$, and hence relative to $\Xi_N$, $H$, $K^*$, the same relationship that 
$\Xi^*$ did to $\Xi$, $H$, $K$. From Theorem 1 stated in the first paragraph of this paper, 
we conclude that (b) is satisfied. 

Even when K(a, b) is bounded, the above condition of weak sequential separability 
is weaker than the strong separability condition used in Theorem 2.25 of [1]. 

A CLASS OF EXPERIMENTAL DESIGNS USING BLOCKS OF 
TWO PLOTS 


By O. KEMPTHORNE 
Iowa State College 


1. Introduction and Summary. Over the past fifteen or so years, a large 
number of classes of experimental designs have been evolved by Yates, Bose 
and Nair and others (see [1] for a systematic account). The aim in all cases 
was to evolve patterns of observation which could utilize natural groupings in 
the experimental material, such as for instance litters of mice or small numbers 
of plots perhaps contiguous to each other. By arranging the treatments to be 
compared in specific ways which utilize the natural grouping, it is possible to 
enable treatment contrasts to be estimated by comparisons of observations on 
experimental units, which we shall call plots, within the natural groups. This 
enables the comparisons to be made usually with considerably greater accuracy 
than would obtain if the experimenter were forced to randomize the positions 
of the treatments without respect to these groupings. 

In experimental work in some branches of biology, natural groups of size two 
are of fairly frequent occurrence, for example twins, or halves of plants, or halves 
of leaves. The development of experimental designs is not complete in this par- 
ticular respect. The designs which have been developed for blocks of two plots 
or experimental units are as follows: 

(1) symmetrical pairs (Yates, [2]) which require (t - 1) replicates if there are t 
treatments. 

(2) quasifactorial designs if the number of treatments is a power of 2 (see [1] 
in this respect). 

It appears therefore that development of a class of designs using blocks of two 
plots is desirable, and this is the purpose of the present paper. 


2. Structure of the Designs. Let the number of treatments be n and suppose 
r replicates of each treatment are desired. The structure of the class of designs 
is that treatment i (= 1, 2, ..., n) is placed in a block with each of the 
treatments i + s, i + s + 1, ..., i + s + r - 1, where each of the numbers i + s 
to i + s + r - 1 is to be reduced modulo n, that is, is to be replaced by the 
remainder after dividing it by n, considering 0 to be identical to n. This structure 
is possible only if 2s + r - 1 = n, that is, if n + 1 - r is even. 

The pattern of observations is specified therefore by n and r, and n + r — 1 
must be even. The number r is the number of times each treatment is replicated 
and the total number of blocks is rn/2 and of plots is rn. In practice the 
treatments would be arranged in random order before making up the pairs of 
treatments which are to lie in a block together, these pairs would be assigned to 
the blocks at random, and the individuals in each pair would be assigned to the 
plots in the block at random. 
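
The construction just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `two_plot_blocks` and `randomize` are hypothetical and do not come from the paper, and the randomization follows the three steps of the preceding paragraph in the simplest possible way.

```python
import random

def two_plot_blocks(n, r):
    """Pair treatment i with i+s, ..., i+s+r-1 (reduced mod n, 0 read as n)."""
    if not 1 <= r <= n - 1 or (n + 1 - r) % 2 != 0:
        raise ValueError("need 1 <= r <= n - 1 with n + 1 - r even")
    s = (n + 1 - r) // 2                      # from 2s + r - 1 = n
    blocks = set()
    for i in range(1, n + 1):
        for d in range(s, s + r):
            j = (i + d - 1) % n + 1           # reduction modulo n, with 0 -> n
            blocks.add((min(i, j), max(i, j)))  # each pair is generated twice
    return sorted(blocks)

def randomize(blocks, n, seed=None):
    """Random treatment order, random order of pairs, random plot within block."""
    rng = random.Random(seed)
    labels = list(range(1, n + 1))
    rng.shuffle(labels)
    plan = [[labels[i - 1], labels[j - 1]] for i, j in blocks]
    rng.shuffle(plan)
    for pair in plan:
        rng.shuffle(pair)
    return plan

if __name__ == "__main__":
    design = two_plot_blocks(7, 4)            # s = 2, so rn/2 = 14 blocks
    print(len(design), design[:4])
```

Because the pair {i, j} is produced once from i and once from j, storing the pairs in a set leaves exactly rn/2 distinct blocks.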


3. The Analysis of the Designs. On the basis of the usual assumption of addi- 
tivity of treatment effects, and by virtue of randomization, we may apply the 
method of least squares to obtain estimates of treatment comparisons. These 
normal equations may be obtained from the standard reduced normal equations 
for the two-way classification with unequal numbers, namely 

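$$\sum_{j'} \Bigl( N_{.j}\,\delta_{jj'} - \sum_{i} \frac{n_{ij}\, n_{ij'}}{N_{i.}} \Bigr) \hat\tau_{j'} = Q_j, \tag{1}$$

where $\delta_{jj'}$ equals 1 if $j = j'$ and 0 otherwise, $y_{ij}$ is the yield of the jth treatment in the ith block, $n_{ij}$ is the number of times the jth 
treatment is represented in the ith block, 

$$N_{.j} = \sum_i n_{ij}, \qquad N_{i.} = \sum_j n_{ij}, \qquad Q_j = Y_{.j} - \sum_i \frac{n_{ij}\, Y_{i.}}{N_{i.}},$$

where $Y_{.j} = \sum_i y_{ij}$, $Y_{i.} = \sum_j y_{ij}$. In this instance $N_{.j}$ is r for all j, $N_{i.}$ is 2 
for all i, and we get 

$$\frac{r}{2}\,\hat\tau_j - \frac{1}{2}\sum_{j'} \lambda_{jj'}\,\hat\tau_{j'} = Q_j, \tag{2}$$

where $\lambda_{jj'} = 1$ if $j'$ and $j$ occur in a block together and is zero otherwise. These 
equations may be written as $A\hat\tau = Q$. 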

The matrix of coefficients A in (2) is a special type of matrix, known as a 
circulant (see for example Ferrar [4]). The first row consists of r/2 followed by 
(s - 1) zeros followed by r terms equal to -1/2, the remaining terms being zero. 
The second row is obtained from the first by moving the elements along one 
step, putting the last element of row one as the first element of row two and so 
on. In the present instance the circulant matrices we are concerned with are 
also symmetrical so that the characteristic roots are real. The roots are also 
nonnegative. 

We shall now review briefly some properties of circulant matrices. The circulant 
matrix will be denoted by $[a_1\ a_2\ \cdots\ a_n]$, that is, by its first row. The properties 
we need are as follows. 

(i) The determinant of the circulant matrix $[a_1\ a_2\ \cdots\ a_n]$ is equal to 

$$\prod_{i=1}^{n} \bigl(a_1 + a_2 w_i + a_3 w_i^2 + \cdots + a_n w_i^{n-1}\bigr),$$

where $w_i$, $i = 1, 2, \ldots, n$, are the nth roots of unity. 

(ii) The characteristic matrix is also a circulant and therefore has a determinant 
equal to 

$$\prod_{i=1}^{n} \bigl(a_1 - \lambda + a_2 w_i + a_3 w_i^2 + \cdots + a_n w_i^{n-1}\bigr),$$

so that the latent roots of the matrix are 

$$\lambda_i = a_1 + a_2 w_i + a_3 w_i^2 + \cdots + a_n w_i^{n-1}, \qquad i = 1, 2, \ldots, n.$$
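
Property (ii) can be checked numerically for the coefficient matrix of these designs. The sketch below is illustrative only: the helper names are hypothetical, and it assumes the first row described earlier (r/2, then s - 1 zeros, then r terms equal to -1/2, then zeros). It verifies that each vector $(1, w_i, w_i^2, \ldots)$ is a latent vector with the stated latent root.

```python
import cmath

def design_first_row(n, r):
    """First row of A: r/2, (s - 1) zeros, r terms of -1/2, remaining zeros."""
    s = (n + 1 - r) // 2
    row = [0.0] * n
    row[0] = r / 2
    for k in range(s, s + r):
        row[k] = -0.5
    return row

def circulant(first_row):
    """Each row is the previous one moved one step to the right, cyclically."""
    n = len(first_row)
    return [[first_row[(j - k) % n] for j in range(n)] for k in range(n)]

def latent_roots(first_row):
    """lambda_i = a_1 + a_2 w_i + ... + a_n w_i^(n-1) over the nth roots of unity."""
    n = len(first_row)
    roots = []
    for i in range(n):
        w = cmath.exp(2j * cmath.pi * i / n)
        roots.append(sum(a * w ** k for k, a in enumerate(first_row)))
    return roots

if __name__ == "__main__":
    row = design_first_row(7, 4)
    for lam in latent_roots(row):
        print(round(lam.real, 6), round(abs(lam.imag), 12))
```

For a symmetric first row the imaginary parts vanish up to rounding, and the root belonging to $w = 1$ is zero because the row sums to zero.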


The matrix A is singular and the equations therefore do not have a unique 
solution, corresponding to the fact that the $\tau_j$'s are not estimable (see for 
instance [1], p. 77). However, it is easily seen that any comparison of the $\tau_j$'s, 
say $\sum \lambda_j \tau_j$, with $\sum \lambda_j$ equal to zero is estimable. It is also known that we may 
impose any condition on the normal equations, $A\hat\tau = Q$, so that we obtain a 
unique solution, say $\hat\tau_0$, and that the best linear unbiased estimate of $\lambda'\tau = 
\sum \lambda_j \tau_j$, $\sum \lambda_j = 0$, is equal to $\lambda'\hat\tau_0$. The simplest condition which may be 
imposed is that $\sum \hat\tau_j = 0$ and the solution may be obtained by the device of 
writing the normal equations in the form 

$$\begin{pmatrix} A & \mathbf{1} \\ \mathbf{1}' & 0 \end{pmatrix} \begin{pmatrix} \hat\tau_0 \\ 0 \end{pmatrix} = \begin{pmatrix} Q \\ 0 \end{pmatrix}, \tag{3}$$

where $\mathbf{1}$ denotes an n × 1 matrix whose elements are unity. We shall denote the 
matrix 

$$\begin{pmatrix} A & \mathbf{1} \\ \mathbf{1}' & 0 \end{pmatrix}$$

by $A^*$ and it is seen that $A^*$ is nonsingular, so that it has an inverse. 


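Now consider the inverse of $A^*$. In full, we have to find the matrix C with 
element $c_{ij}$, such that 

$$\begin{pmatrix} a_1 & a_2 & \cdots & a_n & 1 \\ a_n & a_1 & \cdots & a_{n-1} & 1 \\ \vdots & \vdots & & \vdots & \vdots \\ a_2 & a_3 & \cdots & a_1 & 1 \\ 1 & 1 & \cdots & 1 & 0 \end{pmatrix} \begin{pmatrix} c_{11} & c_{12} & \cdots & c_{1n} & c_{1,n+1} \\ \vdots & \vdots & & \vdots & \vdots \\ c_{n1} & c_{n2} & \cdots & c_{nn} & c_{n,n+1} \\ c_{n+1,1} & c_{n+1,2} & \cdots & c_{n+1,n} & c_{n+1,n+1} \end{pmatrix} = I_{n+1},$$

where $I_{n+1}$ is the (n + 1) × (n + 1) identity matrix. It can be shown easily 
that 

$$c_{i,n+1} = c_{n+1,i} = \frac{1}{n}, \quad i = 1, \ldots, n, \qquad c_{n+1,n+1} = 0,$$

and that the matrix $A^*$ is symmetrical, so that C is also symmetrical. 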

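Now let $w_1, w_2, \ldots, w_n$ be the nth roots of unity, where $w_1 = 1$, $w_2 = w = 
\cos 2\pi/n + i \sin 2\pi/n$ and $w_j = w^{j-1}$, and consider the first column of C. It is 
given by the equations 

$$A^* \begin{pmatrix} c_{11} \\ c_{21} \\ \vdots \\ c_{n1} \\ c_{n+1,1} \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \\ 0 \end{pmatrix}.$$

Take any root of unity other than 1, say $w_j$, and form the sum of row 1 plus $w_j$ times row 2 
plus $w_j^2$ times row 3 and so on, to $w_j^{n-1}$ times row n. The terms in $c_{n+1,1}$ cancel, 
since the powers of $w_j$ sum to zero, and we get 

$$\bigl(c_{11} + w_j c_{21} + w_j^2 c_{31} + \cdots + w_j^{n-1} c_{n1}\bigr)\lambda_j = 1,$$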

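where $\lambda_j$ is as defined above. Doing this for all the nth roots of unity we obtain 
the set of equations in the $c_{i1}$'s, 

$$\sum_{k=1}^{n} w_j^{k-1} c_{k1} = \frac{1}{\lambda_j} \quad (j = 2, \ldots, n), \qquad \sum_{k=1}^{n} c_{k1} = 0,$$

so that 

$$c_{k1} = \frac{1}{n} \sum_{j=2}^{n} \frac{w_j^{-(k-1)}}{\lambda_j}, \qquad k = 1, \ldots, n. \tag{4}$$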

By the same process one can show that the matrix C is a circulant after the last 
row and column, which we know, are blocked out. We have therefore obtained 
the matrix C. 


Utilizing the fact that the $\lambda$'s are real, so that the imaginary parts must be 
identically zero, we have therefore 

$$\lambda_{j+1} = a_1 + a_2 \cos j\theta + a_3 \cos 2j\theta + \cdots + a_n \cos (n-1)j\theta,$$

where $\theta = 2\pi/n$, and 

$$c_{k1} = \frac{1}{n} \sum_{j=2}^{n} \frac{\cos (k-1)(j-1)\theta}{\lambda_j}. \tag{5}$$

Since $a_1 = r/2$ and the remaining a's are either $-\tfrac12$ or zero, we find that 

$$\lambda_{j+1} = \frac{r}{2} + \frac{\sin (n-r)j\theta/2}{2 \sin j\theta/2}. \tag{6}$$


Finally we have the fact that the intrablock estimates are obtainable from 

$$\begin{pmatrix} \hat\tau_0 \\ 0 \end{pmatrix} = C \begin{pmatrix} Q \\ 0 \end{pmatrix}, \tag{7}$$

and the variance-covariance matrix of the $\hat\tau_{0j}$'s is $\sigma^2 C$. 
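
As a numerical check on the closed forms, the sketch below computes the first column of C from the cosine sum over the non-zero latent roots, with the roots taken as $r/2 + \sin((n-r)j\theta/2)/(2\sin(j\theta/2))$, and confirms that the resulting circulant behaves as an inverse of A on contrasts, that is, AC = I - J/n where J is the matrix of ones. The function names are hypothetical and the formulas are used as read from the text above, so this is a sketch under those assumptions rather than the paper's own computation.

```python
import math

def lambda_root(n, r, j):
    """Latent root lambda_{j+1} for j = 1, ..., n - 1, using theta = 2*pi/n."""
    theta = 2 * math.pi / n
    return r / 2 + math.sin((n - r) * j * theta / 2) / (2 * math.sin(j * theta / 2))

def c_first_column(n, r):
    """c_{k1}, k = 1..n (returned 0-based), as a cosine sum over non-zero roots."""
    theta = 2 * math.pi / n
    lam = [lambda_root(n, r, j) for j in range(1, n)]
    return [sum(math.cos(k * j * theta) / lam[j - 1] for j in range(1, n)) / n
            for k in range(n)]

def coefficient_matrix(n, r):
    """Circulant A with first row r/2, (s - 1) zeros, r terms -1/2, zeros."""
    s = (n + 1 - r) // 2
    row = [0.0] * n
    row[0] = r / 2
    for k in range(s, s + r):
        row[k] = -0.5
    return [[row[(j - i) % n] for j in range(n)] for i in range(n)]

if __name__ == "__main__":
    n, r = 7, 4
    c = c_first_column(n, r)
    print([round(v, 4) for v in c])
```

Given a vector Q of adjusted treatment totals summing to zero, the intrablock estimates would then be the products of the circulant built on `c` with Q.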


4. The Efficiency of the Designs. The standard measure of the efficiency of 
an incomplete block design is the mean variance of treatment differences in 
terms of $\sigma^2$, the error variance, divided into $2\sigma^2/r$, where r is the number of times 
each treatment is replicated. 

In the present case 

$$\operatorname{Var}(\hat\tau_j - \hat\tau_{j'}) = \sigma^2 (c_{jj} + c_{j'j'} - 2c_{jj'}) = 2\sigma^2 (c_{jj} - c_{jj'}).$$

Summing these variances over all n(n - 1)/2 possible differences, and utilizing 
the fact that the relevant part of C is a circulant, we get as the mean variance 
of a treatment difference: 

$$2\sigma^2 \Bigl[ c_{11} - \frac{2}{n(n-1)} \bigl\{ (n-1)c_{12} + (n-2)c_{13} + \cdots + c_{1n} \bigr\} \Bigr]$$

$$= 2\sigma^2 \Bigl[ c_{11} - \frac{2}{n(n-1)} \bigl\{ n c_{12} + n c_{13} + \cdots + n c_{1,(n+1)/2} \bigr\} \Bigr]$$

if n is odd, and 

$$= 2\sigma^2 \Bigl[ c_{11} - \frac{2}{n(n-1)} \bigl\{ n c_{12} + n c_{13} + \cdots + n c_{1,(n/2)} + \frac{n}{2}\, c_{1,(n/2)+1} \bigr\} \Bigr]$$

if n is even. In either case it can be seen that the mean variance reduces to 
$2\sigma^2 c_{11}\, n/(n - 1)$. We may note that $c_{11}$ is in fact equal to one-nth of the sum of the 
reciprocals of the non-zero latent roots of A. A simple mathematical expression 
for this sum has not been obtained. Finally the efficiency factor of the designs is 
equal to $(n - 1)/(n r c_{11})$. Since the design is completely specified by n and r or s 
we have computed Table 1, which gives the efficiency factors for a range of 
values of n and r. 
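
The efficiency factor is easy to evaluate by machine. The following sketch assumes the expression $(n-1)/(n r c_{11})$ with $c_{11}$ equal to one-nth of the sum of reciprocals of the non-zero latent roots, and takes those roots from the sine formula of Section 3; the function name is hypothetical. It is offered as a way to regenerate entries of the kind tabulated, not as a reproduction of Table 1.

```python
import math

def efficiency_factor(n, r):
    """(n - 1) / (n * r * c11), with c11 = (1/n) * sum of 1/lambda over non-zero roots."""
    if not 1 <= r <= n - 1 or (n + 1 - r) % 2 != 0:
        raise ValueError("design not possible: need n + 1 - r even")
    theta = 2 * math.pi / n
    roots = [r / 2 + math.sin((n - r) * j * theta / 2) / (2 * math.sin(j * theta / 2))
             for j in range(1, n)]
    if min(abs(v) for v in roots) < 1e-12:
        # A further zero root means some treatment contrasts are not estimable.
        raise ValueError("design is disconnected")
    c11 = sum(1.0 / v for v in roots) / n
    return (n - 1) / (n * r * c11)

if __name__ == "__main__":
    for n, r in [(5, 4), (9, 2), (20, 3)]:
        print(n, r, round(efficiency_factor(n, r), 3))
```

When r = n - 1 every pair of treatments occurs together once and the function returns n/(2(n - 1)), the familiar value for a balanced design in blocks of two.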


5. The Utilization of Inter-block Information. As with all classes of incomplete 
block design, the information contained on treatment comparisons in block 
comparisons must be considered. Whether it is actually worth incorporating 
with the intrablock information depends on the extent to which the grouping 
of the experimental units into blocks achieves a marked reduction in the error 
mean square. 

The usual basis for the utilization of the interblock information will be used, 
namely that a block total, say $B_i$, has an expected value 

$$\sum_{j=1}^{n} \delta_{ij}\, \tau_j,$$

where $\delta_{ij} = 1$ if treatment j is in block i and equals 0 otherwise, and is 
distributed around this expected value with a variance of $\sigma_1^2$, say. On the basis of 
the model we are led to the reduced normal equations 


$$\sum_{j'=1}^{n} \Bigl( \sum_{i} \delta_{ij}\, \delta_{ij'} \Bigr) \hat\tau_{j'} = \sum_{i} \delta_{ij} B_i - \frac{r}{b} \sum_{i} B_i, \tag{8}$$

where b is the number of blocks. 
Letting $W = 1/\sigma^2$ and $W' = 1/\sigma_1^2$ and combining these equations with 
equations (2) we get as the estimating equations in our particular case 

$$\frac{r(W + W')}{2}\, \hat\tau_j - \frac{W - W'}{2} \sum_{j'} \lambda_{jj'}\, \hat\tau_{j'} = W Q_j + \frac{W'}{2} (T_j - \bar T), \qquad j = 1, \ldots, n, \tag{9}$$

where $T_j$ is the total of the blocks containing treatment j and $\bar T$ is the mean of the $T_j$. (See [2], p. 545, 
equation 15, for example.) 


TABLE 1 


Efficiency Factor of Designs 


Number of                     Number of Replicates 
treatments 


‘ 


9 

10 

1] 

12 

13 a ; — 
14 ‘ 487 
15 oa . 
20 . 284 437 — .489 
30 210; — 22) — 447 


- means design not possible, NC means not computed. 


The matrix of the coefficients is again a circulant so that the solution can be 
written out from the previous sections. The nonzero roots of the matrix are 

$$\lambda^*_{j+1} = \frac{r(W + W')}{2} + (W - W')\, \frac{\sin (n-r)j\theta/2}{2 \sin j\theta/2} = (W - W')\lambda_{j+1} + rW'.$$

The solution of the normal equations is therefore given by 

$$\hat\tau = C^* R, \tag{10}$$

where $C^*$ is the circulant $[c^*_{11}, c^*_{12}, \ldots, c^*_{1n}]$, and 

$$c^*_{1k} = \frac{1}{n} \sum_{j=2}^{n} \frac{\cos (k-1)(j-1)\theta}{\lambda^*_j}.$$

In just the same way as before we find that the variance of the estimated 
difference between treatments i and j is 

$$2(c^*_{ii} - c^*_{ij}), \tag{11}$$

which is equal to 

$$2(c^*_{11} - c^*_{1,|j-i|+1}),$$

where $|j - i| = (j - i) \bmod n$. Similarly the mean variance of all treatment 
differences is $\bigl(2n/(n - 1)\bigr)\, c^*_{11}$. 


6. The Estimation of the Weights, W and W’. In order to utilize the inter- 
block information it is necessary to follow the usual device of estimating the 
weights W and W’ for use in the estimating equations. 

The analysis of variance which must be computed to estimate W' is given in 
Table 2, 


TABLE 2 
Analyses of Variance of Design 

                                 d.f.             S.S.    S.S.    d.f. 
Blocks ignoring treatments       (nr/2) - 1       B       T       n - 1            Treatments ignoring blocks 
Treatments eliminating blocks    n - 1            T'      B'      (nr/2) - 1       Blocks eliminating treatments 
Error                            n(r/2 - 1) + 1   S_E     S_E     n(r/2 - 1) + 1   Error 
Total                            nr - 1           S       S       nr - 1           Total 


where B is one half the sum of squares of block totals minus correction, 

$$T' = \sum_j \hat\tau_j Q_j,$$

the $\hat\tau$'s being given by equation (7), T is 1/r times the sum of squares of treatment 
totals minus correction, $S = \sum y_{ij}^2$ minus correction, correction $= (\sum y_{ij})^2/nr$, 
and $S_E$ and $B'$ are obtained by subtraction. 
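If we write $\sigma_1^2 = \sigma^2 + 2\sigma_b^2$, it may be verified that 

$$E(S_E) = \Bigl[ n\Bigl(\frac{r}{2} - 1\Bigr) + 1 \Bigr] \sigma^2,$$

and that 

$$E(B') = \Bigl(\frac{nr}{2} - 1\Bigr)\sigma^2 + n(r - 1)\sigma_b^2,$$

$$E\Bigl\{ 2B' - \frac{2(n - 2) S_E}{nr - 2n + 2} \Bigr\} = n(r - 1)(\sigma^2 + 2\sigma_b^2).$$

We may therefore estimate W and W' by 

$$W = \frac{n(r/2 - 1) + 1}{S_E}, \qquad W' = \frac{n(r - 1)}{2B' - \dfrac{2 S_E (n - 2)}{nr - 2n + 2}}.$$

These values are then used in the estimating equations (10). 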
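
The two estimates amount to a short calculation. The sketch below assumes they read W = [n(r/2 - 1) + 1]/S_E and W' = n(r - 1)/[2B' - 2(n - 2)S_E/(nr - 2n + 2)], where S_E is the error sum of squares and B' the sum of squares for blocks eliminating treatments; the function name and the guard against a non-positive denominator are illustrative additions, not part of the paper.

```python
def estimate_weights(n, r, s_e, b_prime):
    """Return (W, W') from the error S.S. and the blocks-eliminating-treatments S.S."""
    error_df = n * (r / 2 - 1) + 1
    w = error_df / s_e
    denom = 2 * b_prime - 2 * s_e * (n - 2) / (n * r - 2 * n + 2)
    if denom <= 0:
        # Sampling error can make this quantity non-positive; how to proceed
        # in that case is a judgment the text does not settle.
        raise ValueError("estimate of sigma^2 + 2*sigma_b^2 is not positive")
    w_prime = n * (r - 1) / denom
    return w, w_prime

if __name__ == "__main__":
    # Sums of squares set equal to their expectations for sigma^2 = 2, sigma_b^2 = 3.
    print(estimate_weights(7, 4, s_e=16.0, b_prime=89.0))
```

With the sums of squares replaced by their expected values the function returns exactly $1/\sigma^2$ and $1/(\sigma^2 + 2\sigma_b^2)$, which is a convenient check on the algebra.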


7. Relation of Designs to Partially Balanced Incomplete Block Designs. The 
class of partially balanced incomplete block designs evolved by Bose and Nair 
[3] and later extended by Nair and Rao [5] has been found to contain many 
designs developed other than from the specifications of the class. It is clear 
(from for example [1], pp. 546-548) that this class of designs corresponds to a 
particular form of the reduced normal equations for the treatment constants 
($\tau$). If corresponding to any one treatment j the remaining treatments (j') can 
be divided into classes, say $S_{j1}, S_{j2}, \ldots, S_{jm}$, within which $\lambda_{jj'}$ is constant, then 
the following must hold. Let $G_{jk}$ be the sum of the $\tau$'s for the treatments in $S_{jk}$; 
then the sum of the normal equations for the treatments in $S_{jk}$ must give an 
equation in the $G_{jk}$'s, the coefficients of which do not depend on the particular 
treatment j originally taken. 

From this point of view it is easily seen that the class of designs given in this 
paper belongs to the class of partially balanced incomplete block designs. The 
classes $S_{1k}$ are as follows: $S_{11}$ consists of treatments 2 and n, $S_{12}$ of treatments 3 
and n - 1 and so on. If n is even, there are n/2 + 1 classes, the class $S_{1,n/2+1}$ 
consisting of treatment (n/2 + 1). If n is odd there are (n - 1)/2 classes each 
containing 2 treatments. The associate classes for the other treatments are defined 
in a circular way, $S_{21}$ consisting for example of treatments 3 and 1, $S_{22}$ of 4 and 
n, and so on. The representation of the class of designs given herein as partially 
balanced incomplete block designs is, however, of little value, because the 
number of associate classes depends on the number of treatments and may be large, 
and because all analyses of partially balanced incomplete block designs have 
been worked out in terms of the number of associate classes. The accuracies of 
comparisons between treatments in the same or different associate classes are 
given by (11). 
DISTANCE FUNCTIONS AND REGULAR BEST ASYMPTOTICALLY 
NORMAL ESTIMATES 


By WILLIAM F. TAYLOR¹ 
School of Aviation Medicine, Randolph Field, Texas 


Summary. Among the methods of obtaining satisfactory parameter estimates 
are maximum likelihood, minimum chi-square, minimum “reduced” chi-square, 
etc. This paper presents a generalization of the minimum chi-square method 
which yields regular best asymptotically normal (RBAN) estimates and which 
is often very simple to apply. It is shown that the least squares expressions 
associated with the logit and probit transformations are of a type which leads to RBAN 
estimates. 


1. Introduction. In 1945, J. Neyman [1] presented at the Berkeley Symposium 
on Mathematical Statistics and Probability his work on "best asymptotically 
normal" estimates (now called "regular best asymptotically normal" or RBAN). 
He gave for multinomial situations several methods of estimation which yield 
estimates having desirable asymptotic properties. The estimation techniques 
developed by Neyman were all based on the minimization of a special kind of 
distance function, namely, the $\chi^2$ goodness-of-fit expression or a similar one called 
the "reduced" $\chi^2$. 

Certain work by J. Berkson [2] brought the author's attention to functions 
which were a generalization of the $\chi^2$ distance function and which yielded 
estimates upon minimization. In this paper there is presented a class of distance 
functions which lead to RBAN estimates and which includes minimum $\chi^2$, 
logit, and probit estimates. The theorem of Section 3 is proved via a lemma from 
results given in [1]. It has been pointed out to the author that this theorem may 
also be obtained readily from the work of Barankin and Gurland [4]. 

The author wishes to thank Professor Neyman and Dr. Berkson for their 
assistance in this work. 


2. Distance functions leading to RBAN estimates. Suppose the situation 
is the one described in [1], page 239. There are s sequences of independent trials, 
each sequence consisting of $n_i$ trials. A trial of the ith sequence can produce 
$\nu_i$ exclusive results with probabilities 

$$p_{i1},\ p_{i2},\ \ldots,\ p_{i\nu_i}, \qquad \sum_{j=1}^{\nu_i} p_{ij} = 1.$$


Received 9/17/52. 
1 This work was done while the author was at the University of California, Berkeley, 
California. 
Let $N = \sum_{i=1}^{s} n_i$ and $Q_i = n_i/N$. By $\theta$ is meant the set of m unknown 
parameters $(\theta_1, \theta_2, \ldots, \theta_m)$. The $p_{ij}$ are assumed equal to $f_{ij}(\theta)$, where the $f_{ij}$ are 
continuous functions having continuous partial derivatives up to the second 
order. It is assumed that $f_{ij}(\theta) > 0$ and 

$$\sum_{j=1}^{\nu_i} f_{ij}(\theta) = 1. \tag{1}$$

$\theta_1, \ldots, \theta_m$ are assumed functionally independent in that for some m functions 
$f_{ij}$, the determinant 

$$\begin{vmatrix} f_{i_1 j_1 1} & \cdots & f_{i_1 j_1 m} \\ \vdots & & \vdots \\ f_{i_m j_m 1} & \cdots & f_{i_m j_m m} \end{vmatrix} \neq 0, \qquad f_{ijk} = \frac{\partial f_{ij}}{\partial \theta_k}. \tag{2}$$


DEFINITION 1. $\delta(p, q)$ will be called a distance function of $p = (p_{11}, \ldots, p_{s\nu_s})$ 
and $q = (q_{11}, \ldots, q_{s\nu_s})$ if it is such that 

(i) $\delta(p, p) = 0$, 

(ii) $\delta(p, q) > 0$ for $p \neq q$, 

(iii) $\delta(p, q)$ is continuous with continuous partial derivatives up to the second 
order. 

Letting $p_{ij} = f_{ij}(\theta)$, the problem is to estimate the $\theta$'s. Let $\hat\theta_t(q)$, $t = 1, \ldots, 
m$, be functions of q which are estimates of $\theta_1, \ldots, \theta_m$, respectively. Let 

$$\partial \delta(p, q)/\partial \theta_k = \psi_k(\theta, q).$$


The following lemma is due to Werner Leimbacher, formerly of the Statistical 
Laboratory at the University of California. 

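LEMMA 1. Those values, $\hat\theta_t(q)$, functions of $q_{ij}$, $i = 1, \ldots, s$, $j = 1, \ldots, \nu_i$, 
which minimize $\delta(p, q)$ are RBAN estimates of $\theta_t$, $t = 1, \ldots, m$, if $\delta(p, q)$ is 
such that 

$$\frac{\partial^2 \delta(p, q)}{\partial q_{ij}\, \partial \theta_k} \Big|_{q=f} = C\, Q_i\, \frac{f_{ijk}}{f_{ij}}, \tag{3}$$

$$\frac{\partial^2 \delta(p, q)}{\partial \theta_k\, \partial \theta_t} \Big|_{q=f} = -C \sum_{i=1}^{s} Q_i \sum_{j=1}^{\nu_i} \frac{f_{ijk}\, f_{ijt}}{f_{ij}}, \tag{4}$$

where $f_{ijk} = \partial f_{ij}/\partial \theta_k$ and C is a constant. 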

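PROOF. It is shown in [1] (Theorem 2, page 248) that for a statistic, $\hat\theta_t(q)$, 
function of $q_{11}, \ldots, q_{s\nu_s}$, to be a RBAN estimate of $\theta_t$, it is sufficient that it 
satisfy the conditions 

(a) that $\hat\theta_t(q)$ have continuous partial derivatives with respect to all the 
independent variables, $q_{ij}$, 

(b) that the result of substituting $q_{ij} = f_{ij}(\theta_1, \ldots, \theta_m)$, $i = 1, \ldots, s$, 
$j = 1, \ldots, \nu_i$, in $\hat\theta_t(q)$ leads to the identity 

$$\hat\theta_t(f) \equiv \theta_t, \tag{5}$$

(c) that 

$$\frac{\partial \hat\theta_t}{\partial q_{ij}} \Big|_{q_{ab} = f_{ab}} = \frac{Q_i}{f_{ij}\, \Delta} \sum_{k=1}^{m} f_{ijk}\, \Delta_{kt}, \tag{6}$$

where 

$$\Delta = |G_{kt}|, \tag{7}$$

$$G_{kt} = \sum_{i=1}^{s} Q_i \sum_{j=1}^{\nu_i} \frac{f_{ijk}\, f_{ijt}}{f_{ij}}, \tag{8}$$

and $\Delta_{kt}$ is the cofactor of $G_{kt}$. 

That $\Delta \neq 0$ follows from the initial assumptions on the $f_{ij}$'s. 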

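The proof of the lemma consists mainly in showing that condition (c) is 
satisfied when $\delta$ satisfies equations (3) and (4). Suppose the equations $\psi_k = 0$, 
$k = 1, \ldots, m$, have been solved for the $\theta$'s. (These solutions are the $\hat\theta_t(q)$, 
$t = 1, \ldots, m$, which minimize $\delta(p, q)$.) In order to find $\partial \hat\theta_t/\partial q_{ij}$ one substitutes 
the $\hat\theta_t$'s into the $\psi_k$'s and differentiates with respect to $q_{ij}$. This results in 
equations of the form 

$$\frac{\partial \psi_k}{\partial q_{ij}} + \sum_{t=1}^{m} \frac{\partial \psi_k}{\partial \hat\theta_t}\, \frac{\partial \hat\theta_t}{\partial q_{ij}} = 0, \qquad k = 1, \ldots, m. \tag{9}$$

Solving for $(\partial \hat\theta_t/\partial q_{ij})$, and using (4), one gets 

$$\frac{\partial \hat\theta_t}{\partial q_{ij}} \Big|_{q=f} = \frac{1}{C \Delta} \sum_{k=1}^{m} \Delta_{kt}\, \frac{\partial \psi_k}{\partial q_{ij}} \Big|_{q=f}. \tag{10}$$

Since, by (3), 

$$\frac{\partial \psi_k}{\partial q_{ij}} \Big|_{q=f} = C\, Q_i\, \frac{f_{ijk}}{f_{ij}},$$

then 

$$\frac{\partial \hat\theta_t}{\partial q_{ij}} \Big|_{q=f} = \frac{Q_i}{f_{ij}\, \Delta} \sum_{k=1}^{m} f_{ijk}\, \Delta_{kt}. \tag{11}$$

Conditions (a) and (b) are satisfied since $\delta(p, q)$ is assumed to be a distance 
function with continuous partial derivatives up to the second order. 
3. A particular class of distance functions. 

DEFINITION 2. The symbol $\mathcal{F}$ denotes a class of distance functions satisfying the 
conditions of the previous lemma; i.e. $\delta(p, q)$ is said to belong to $\mathcal{F}$ if it is a distance 
function and if it satisfies the conditions (i) and (ii) as given by equations (3) and 
(4) above. 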

THEOREM. If h(x) is a strictly monotonic function of x for 0 < x < 1 possessing 
continuous derivatives up to the third order and if the function g(u, v) is positive for 
0 < u < 1, 0 < v < 1, has continuous partial derivatives up to the second order, 
and satisfies the condition 

$$g(f_{ij}, f_{ij}) \Bigl[ \frac{dh(x)}{dx} \Big|_{x = f_{ij}} \Bigr]^2 = \frac{1}{f_{ij}} \tag{12}$$

for all i, j, then the function 

$$\delta_1(p, q) = \sum_{i=1}^{s} n_i \sum_{j=1}^{\nu_i} g(f_{ij}, q_{ij}) \bigl( h(q_{ij}) - h(f_{ij}) \bigr)^2 \tag{13}$$

belongs to class $\mathcal{F}$. 
In other words, this theorem asserts that the functions $\hat\theta_t(q)$, $t = 1, \ldots, m$, 
which minimize $\delta_1(p, q)$ are RBAN estimates. 
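PROOF. $\delta_1(p, q)$ is a distance function. This is apparent by inspection since h(x) 
is strictly monotonic and continuous and possesses continuous derivatives. Also 

$$\frac{\partial \delta_1}{\partial \theta_k} = \sum_{i=1}^{s} n_i \sum_{j=1}^{\nu_i} \Bigl[ g(f_{ij}, q_{ij}) (-2) \bigl( h(q_{ij}) - h(f_{ij}) \bigr) \frac{dh}{dx} \Big|_{x = f_{ij}} f_{ijk} + \frac{\partial g(f_{ij}, q_{ij})}{\partial f_{ij}} \bigl( h(q_{ij}) - h(f_{ij}) \bigr)^2 f_{ijk} \Bigr], \tag{14}$$

$$\frac{\partial^2 \delta_1}{\partial q_{ij}\, \partial \theta_k} \Big|_{q=f} = -2 n_i\, g(f_{ij}, f_{ij}) \Bigl[ \frac{dh}{dx} \Big|_{x = f_{ij}} \Bigr]^2 f_{ijk} = -2N Q_i\, \frac{f_{ijk}}{f_{ij}}, \tag{15}$$

$$\frac{\partial^2 \delta_1}{\partial \theta_k\, \partial \theta_t} \Big|_{q=f} = 2 \sum_{i=1}^{s} n_i \sum_{j=1}^{\nu_i} g(f_{ij}, f_{ij}) \Bigl[ \frac{dh}{dx} \Big|_{x = f_{ij}} \Bigr]^2 f_{ijk}\, f_{ijt} = 2N \sum_{i=1}^{s} Q_i \sum_{j=1}^{\nu_i} \frac{f_{ijk}\, f_{ijt}}{f_{ij}}. \tag{16}$$

Thus $\delta_1(p, q)$ is a member of class $\mathcal{F}$. 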
COROLLARY 1. If the number of exclusive results which can be produced by the 
ith trial is 2, $i = 1, \ldots, s$, then one can put 

$$f_{i1} = f_i, \qquad f_{i2} = 1 - f_i, \qquad q_{i1} = q_i, \qquad q_{i2} = 1 - q_i.$$

If in addition h(x) = -h(1 - x), 0 < x < 1, then $\delta_1(p, q)$ reduces to the form 
$\delta_2(p, q)$ given by the equation 

$$\delta_2(p, q) = \sum_{i=1}^{s} n_i \bigl[ g(f_i, q_i) + g(1 - f_i, 1 - q_i) \bigr] \bigl( h(q_i) - h(f_i) \bigr)^2. \tag{17}$$

The implications of the theorem are that another class of estimates has been 
shown to consist of RBAN estimates. If computation is clumsy using other 
methods, perhaps the one which involves minimizing functions like $\delta_1(p, q)$ is 
easy. Berkson's short cut logit technique is one case of this. It, together with an 
example using probits, is given in the following paragraphs. 


4. Logit estimates. Assume that a sequence of s independent experiments is 
performed. The jth experiment consists in giving dose $x_j$ of some drug to $n_j$ 
individuals. Each individual responds or fails to respond to the drug and the 
proportion responding, $q_j$, is observed. Let the doses be $x_1, x_2, \ldots, x_s$, the 
number of individuals tested $n_1, n_2, \ldots, n_s$ with $\sum_{j=1}^{s} n_j = N$, the proportions 
responding (independent random variables) $q_1, q_2, \ldots, q_s$, and the probabilities 
of responding $p_1, p_2, \ldots, p_s$. 

It is assumed that for $j = 1, \ldots, s$ 

$$1 > p_j = f_j(\alpha, \beta) > 0. \tag{18}$$

The $f_j(\alpha, \beta)$ are assumed to be continuous and to have continuous partial 
derivatives up to the second order. Also the parameters $\alpha$ and $\beta$ are assumed 
independent in the sense that for at least two values of j, say, i and k, 

$$\begin{vmatrix} f_{i\alpha} & f_{i\beta} \\ f_{k\alpha} & f_{k\beta} \end{vmatrix} \neq 0, \tag{19}$$

where $f_{j\alpha} = \partial f_j/\partial \alpha$, $f_{j\beta} = \partial f_j/\partial \beta$. 

The short cut logit technique of estimation as described by Berkson [2] 
begins with the assumption that the $f_j(\alpha, \beta)$ are given by the logistic function, 

$$f_j(\alpha, \beta) = \frac{1}{1 + e^{-(\alpha + \beta x_j)}}, \qquad j = 1, \ldots, s. \tag{20}$$

Noting that $\log [f_j/(1 - f_j)]$ equals the linear expression $\alpha + \beta x_j$, Berkson 
suggests for estimates of $\alpha$ and $\beta$ the functions a(q) and b(q), respectively, which 
minimize 

$$\chi^2_A = \sum_{i=1}^{s} n_i q_i (1 - q_i) \Bigl( \log \frac{q_i}{1 - q_i} - \log \frac{f_i}{1 - f_i} \Bigr)^2. \tag{21}$$

These estimates are very simple to find and techniques have been developed to 
facilitate computation (see [2]). The question has been raised, however, as to 
whether a(q) and b(q) are RBAN estimates of $\alpha$ and $\beta$. Corollary 2 answers this. 
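
Because the logit of $f_i$ is linear in the dose, minimizing the expression in (21) is a weighted least squares fit of the observed logits on the doses with weights $n_i q_i (1 - q_i)$. A minimal sketch follows; the function name is hypothetical, and it assumes every observed proportion lies strictly between 0 and 1, since the logit is otherwise undefined.

```python
import math

def min_logit_chi_square(x, n, q):
    """Return (a, b) minimizing sum n_i q_i (1 - q_i) (logit(q_i) - a - b x_i)^2."""
    w = [ni * qi * (1 - qi) for ni, qi in zip(n, q)]
    y = [math.log(qi / (1 - qi)) for qi in q]       # observed logits
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    a = ybar - b * xbar
    return a, b

if __name__ == "__main__":
    doses = [0.0, 1.0, 2.0, 3.0, 4.0]
    tested = [10, 20, 30, 20, 10]
    observed = [0.25, 0.40, 0.55, 0.75, 0.90]       # made-up proportions
    print(min_logit_chi_square(doses, tested, observed))
```

No iteration is needed, which is the computational simplicity the text refers to; the data in the example are invented purely for illustration.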
COROLLARY 2. Let $p_i$ be assumed equal to $f_i(\alpha, \beta)$, $i = 1, \ldots, s$, where the $f_i$ 
are any functions of $\alpha$ and $\beta$ which are continuous with continuous partial 
derivatives up to the second order and which satisfy the conditions (18) and (19). Then 
the function 

$$\chi^2_A = \sum_{i=1}^{s} n_i q_i (1 - q_i) \Bigl( \log \frac{q_i}{1 - q_i} - \log \frac{f_i}{1 - f_i} \Bigr)^2$$

is a member of class $\mathcal{F}$. 
It follows immediately from this corollary that if $f_i(\alpha, \beta) = 1/[1 + e^{-(\alpha + \beta x_i)}]$ 
the values, a(q) and b(q), which minimize $\chi^2_A$ are RBAN estimates of $\alpha$ and $\beta$. 
PROOF. Corollary 1 will be applied. Let $h(x) = \log [x/(1 - x)]$. This has all 
the properties of h(x) in the theorem and in addition h(x) = -h(1 - x). Also 

$$\frac{dh}{dx} \Big|_{x = f_i} = \frac{1}{f_i (1 - f_i)}.$$

Obviously 

$$q_i (1 - q_i) = q_i (1 - q_i)^2 + (1 - q_i)\, q_i^2.$$

Also, $g(f_i, q_i) = q_i (1 - q_i)^2$ satisfies the conditions imposed by the theorem; 
that is, g(u, v) is positive for 0 < u, v < 1, it has continuous partial derivatives 
up to the second order and it satisfies the condition 

$$g(f_i, f_i) \Bigl[ \frac{dh}{dx} \Big|_{x = f_i} \Bigr]^2 = \frac{f_i (1 - f_i)^2}{f_i^2 (1 - f_i)^2} = \frac{1}{f_i}. \tag{22}$$

Substituting in $\chi^2_A$ one gets 

$$\chi^2_A = \sum_{i=1}^{s} n_i \bigl[ g(f_i, q_i) + g(1 - f_i, 1 - q_i) \bigr] \bigl( h(q_i) - h(f_i) \bigr)^2,$$

which is the form of $\delta_2(p, q)$. Hence $\chi^2_A$ is a member of class $\mathcal{F}$. Note that in the 
above it is not necessary to write $g(f_i, q_i)$ as a function of two variables when one 
argument does not appear. It is done merely to be consistent with the notation 
of the theorem. 


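5. Probit estimates. Next a probit method will be taken up as another 
example of this distance function notion. Suppose there exists a situation which is 
the same as that in the preceding example, only instead of the logistic function 
for $p_i$ it is assumed that $p_i$ is given by the cumulative normal distribution 
function 

$$p_i = \frac{1}{\sqrt{2\pi}\, \sigma} \int_{-\infty}^{x_i} e^{-\frac12 ((\tau - \mu)/\sigma)^2}\, d\tau = \Phi\Bigl( \frac{x_i - \mu}{\sigma} \Bigr), \tag{23}$$

say. A probit method of estimating $\mu$ and $\sigma$ is outlined below. Define $\chi^2_B$ as 
in (24), where $\Phi^{-1}$ is the inverse function of $\Phi$, that is, $\Phi^{-1}$ is a function such 
that $\Phi^{-1}(\Phi(x)) = x$. Let 

$$\chi^2_B = \sum_{i=1}^{s} n_i\, G(f_i, q_i) \bigl( \Phi^{-1}(q_i) - \Phi^{-1}(p_i) \bigr)^2. \tag{24}$$

$G(f_i, q_i)$ will be defined below by (29). Also 

$$\frac{\partial \chi^2_B}{\partial \mu} = \frac{2}{\sigma} \sum_{i=1}^{s} n_i\, G(f_i, q_i) \Bigl( \Phi^{-1}(q_i) - \frac{x_i - \mu}{\sigma} \Bigr) = 0, \tag{25}$$

$$\frac{\partial \chi^2_B}{\partial \sigma} = \frac{2}{\sigma^2} \sum_{i=1}^{s} n_i\, G(f_i, q_i) \Bigl( \Phi^{-1}(q_i) - \frac{x_i - \mu}{\sigma} \Bigr) (x_i - \mu) = 0. \tag{26}$$

Let these last two equations be simplified and written as 

$$\sigma \Bigl( \sum_{i=1}^{s} Q_i\, G(f_i, q_i)\, \Phi^{-1}(q_i) \Bigr) + \mu \Bigl( \sum_{i=1}^{s} Q_i\, G(f_i, q_i) \Bigr) - \sum_{i=1}^{s} Q_i\, G(f_i, q_i)\, x_i = 0, \tag{27}$$

$$\sigma \Bigl( \sum_{i=1}^{s} Q_i\, G(f_i, q_i)\, \Phi^{-1}(q_i)\, x_i \Bigr) + \mu \sum_{i=1}^{s} Q_i\, G(f_i, q_i)\, x_i - \sum_{i=1}^{s} Q_i\, G(f_i, q_i)\, x_i^2 = 0, \tag{28}$$

and let $G(f_i, q_i)$ be defined by the equation 

$$G(f_i, q_i) = \frac{1}{q_i (1 - q_i)} \Bigl( \frac{dp_i}{d\Phi^{-1}(p_i)} \Bigr)^2 \Big|_{p_i = q_i} = \frac{1}{q_i (1 - q_i)} \Bigl[ \frac{d\Phi(x)}{dx} \Big|_{x = \Phi^{-1}(q_i)} \Bigr]^2 = \frac{1}{q_i (1 - q_i)}\, \frac{1}{2\pi}\, e^{-[\Phi^{-1}(q_i)]^2}. \tag{29}$$

Since the coefficients of $\mu$ and $\sigma$ in (27) and (28) are easily found, the solutions 
$\hat\mu(q)$ and $\hat\sigma(q)$ of (27) and (28) can be obtained. $\hat\mu$ and $\hat\sigma$ minimize $\chi^2_B$. 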
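
The probit estimates come from two linear equations in $\mu$ and $\sigma$, so they can be written down directly. The sketch below assumes the weight is $\varphi(\Phi^{-1}(q_i))^2 / (q_i(1 - q_i))$, with $\varphi$ the standard normal density, and that the two equations are $\sigma S_y + \mu S_1 = S_x$ and $\sigma S_{xy} + \mu S_x = S_{xx}$, where the S's are weighted sums over the doses x and the observed probits y. The function name is hypothetical and the standard library `statistics.NormalDist` supplies $\Phi^{-1}$ and $\varphi$.

```python
from statistics import NormalDist

def min_probit_chi_square(x, n, q):
    """Solve the two linear equations for (mu, sigma); requires 0 < q_i < 1."""
    std = NormalDist()
    y = [std.inv_cdf(qi) for qi in q]                     # observed probits
    w = [ni * std.pdf(yi) ** 2 / (qi * (1 - qi))          # n_i * G(f_i, q_i)
         for ni, yi, qi in zip(n, y, q)]
    s1 = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    det = sy * sx - s1 * sxy                              # Cramer's rule
    sigma = (sx * sx - s1 * sxx) / det
    mu = (sy * sxx - sx * sxy) / det
    return mu, sigma

if __name__ == "__main__":
    doses = [-1.0, 0.0, 1.0, 2.0, 3.0, 4.0]
    tested = [20] * 6
    observed = [0.10, 0.25, 0.40, 0.60, 0.80, 0.90]       # made-up proportions
    print(min_probit_chi_square(doses, tested, observed))
```

Using $n_i$ in place of $Q_i = n_i/N$ only rescales both equations by N and leaves the solution unchanged.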
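COROLLARY 3. Let $\chi^2_B$ be given by 

$$\chi^2_B = \sum_{i=1}^{s} \frac{n_i}{q_i (1 - q_i)}\, \frac{1}{2\pi}\, e^{-[\Phi^{-1}(q_i)]^2} \bigl( \Phi^{-1}(q_i) - \Phi^{-1}(f_i) \bigr)^2 \tag{24}$$

and let it be assumed that 

$$f_i(\mu, \sigma) = \frac{1}{\sqrt{2\pi}\, \sigma} \int_{-\infty}^{x_i} e^{-\frac12 ((\tau - \mu)/\sigma)^2}\, d\tau = \Phi\Bigl( \frac{x_i - \mu}{\sigma} \Bigr). \tag{23}$$

Then $\chi^2_B$ is a member of class $\mathcal{F}$ and it follows that $\hat\mu(q)$ and $\hat\sigma(q)$ which minimize 
$\chi^2_B$ are RBAN estimates of $\mu$ and $\sigma$. 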

PROOF. Again apply Corollary 1. $\Phi^{-1}(x)$ is seen to possess the properties of 
h(x) of the theorem and is such that $\Phi^{-1}(x) = -\Phi^{-1}(1 - x)$. As is seen 
immediately $G(f_i, q_i)$ can be written as 

$$\frac{1}{q_i} \Bigl( \frac{1}{\sqrt{2\pi}}\, e^{-\frac12 [\Phi^{-1}(q_i)]^2} \Bigr)^2 + \frac{1}{1 - q_i} \Bigl( \frac{1}{\sqrt{2\pi}}\, e^{-\frac12 [\Phi^{-1}(1 - q_i)]^2} \Bigr)^2 = g(f_i, q_i) + g(1 - f_i, 1 - q_i), \tag{30}$$

where g(u, v) satisfies the conditions of the theorem. It follows that $\chi^2_B$ can be 
put in the same form as $\delta_2(p, q)$ and hence $\chi^2_B$ is a member of class $\mathcal{F}$. 
A practice in bio-assay has been to get maximum likelihood estimates of $\mu$ 
and $\sigma$ by a somewhat lengthy iterative process. Corollary 3 shows, however, 
that if the limiting situation in which $N \to \infty$ with $Q_i = n_i/N$ held constant for 
all i is considered and if the asymptotic properties of the estimates are the 
criteria for the goodness of an estimate, then there is nothing to favor the 
maximum likelihood estimates over the simpler ones derived above. 
ON THE DISTRIBUTION OF THE EXPECTED VALUES 
OF THE ORDER STATISTICS 


By WASSILY HOEFFDING 


University of North Carolina 


Summary. Let $X_1, X_2, \ldots, X_n$ be independent with a common distribution 
function F(x) which has a finite mean, and let $Z_{n1} \le Z_{n2} \le \cdots \le Z_{nn}$ 
be the ordered values $X_1, \ldots, X_n$. The distribution of the n values $EZ_{n1}, \ldots, 
EZ_{nn}$ on the real line is studied for large n. In particular, it is shown that as $n \to \infty$, 
the corresponding distribution function converges to F(x) and any moment of 
that distribution converges to the corresponding moment of F(x) if the latter 
exists. The distribution of the values $Ef(Z_{nm})$ for certain functions f(x) is also 
considered. 


1. Introduction and statement of results. Let $X_1, X_2, \ldots, X_n, \ldots$ be 
mutually independent random variables with a common (cumulative) distribution 
function F(x). Let $Z_{n1} \le Z_{n2} \le \cdots \le Z_{nn}$ be the ordered values $X_1, X_2, 
\ldots, X_n$. It will be assumed that 

$$\int |x|\, dF(x) < \infty, \tag{1}$$

which implies that the expected values $EZ_{n1}, EZ_{n2}, \ldots, EZ_{nn}$ exist. (Throughout 
this paper the statement that an expected value exists will imply that it is finite.) 
The distribution which assigns equal weights to the n values $EZ_{n1}, \ldots, EZ_{nn}$ 
will be referred to as the distribution of the $EZ_{nm}$, and its distribution function 
will be denoted by $F_n(x)$. The primary object of this paper is to show that this 
distribution approximates the distribution represented by F(x) when n is large. 
More precisely, the following will be proved. 

THEOREM 1. Suppose that (1) is satisfied and let g(x) be a real-valued, continuous 
function such that 

$$|g(x)| \le h(x), \tag{2}$$

where the function h(x) is convex and 

$$\int h(x)\, dF(x) < \infty. \tag{3}$$

Then 

$$\lim_{n \to \infty} \frac{1}{n} \sum_{m=1}^{n} g(EZ_{nm}) = \int g(x)\, dF(x). \tag{4}$$


Received 8/12/52. 


The assumption that h(x) is convex is understood in the sense that for any two 
real numbers x, y, 

$$h(ax + (1 - a)y) \le a\,h(x) + (1 - a)\,h(y) \qquad \text{if } 0 < a < 1.$$

With g(x) = cos tx and sin tx, Theorem 1 implies that the characteristic function 
of the distribution of the $EZ_{nm}$ converges to that of $X_1$ as $n \to \infty$, and hence 
$F_n(x) \to F(x)$ for all points of continuity of F(x). With $g(x) = x^k$, k > 0, we 
obtain that the moment of order k of the distribution of the $EZ_{nm}$ converges to the 
corresponding moment of F(x) if the latter exists. 
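
A case where everything is available in closed form may help fix ideas. For the unit exponential distribution, a standard result gives $EZ_{nm} = \sum_{k=n-m+1}^{n} 1/k$, so the averages in Theorem 1 can be computed exactly. The sketch below, with hypothetical function names, does this for $g(x) = x^k$; the choice of the exponential distribution is an illustrative assumption and is not taken from the paper.

```python
def expected_order_stats_exponential(n):
    """E Z_{nm}, m = 1..n, for the unit exponential: sum of 1/k, k = n-m+1..n."""
    values = []
    total = 0.0
    for m in range(1, n + 1):
        total += 1.0 / (n - m + 1)
        values.append(total)
    return values

def mean_power(n, k):
    """(1/n) * sum over m of (E Z_{nm})^k, the kth moment of the distribution F_n."""
    ez = expected_order_stats_exponential(n)
    return sum(v ** k for v in ez) / n

if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, mean_power(n, 1), mean_power(n, 2))
```

The first moment equals 1 for every n, while the second moment works out to $2 - H_n/n$ with $H_n$ the nth harmonic number, and so increases toward the exponential's second moment 2 as n grows.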

If f(x) is a function such that $Ef(X_1)$ exists, we can, more generally, consider 
the distribution of $Ef(Z_{n1}), \ldots, Ef(Z_{nn})$. If f(x) is a strictly monotone function, 
Theorem 1 can be applied in an obvious way. The general case will not be 
considered, but the following special result will be obtained as a simple consequence 
of Theorem 1. 

THEOREM 2. Let f(x) be convex, g(x) convex and nondecreasing (for $x \ge A$ if 
$f(y) \ge A$ for all y), and suppose that 

$$\int x\, dF(x), \qquad \int f(x)\, dF(x) \qquad \text{and} \qquad \int g(f(x))\, dF(x)$$

exist. Then 

$$\lim_{n \to \infty} \frac{1}{n} \sum_{m=1}^{n} g(Ef(Z_{nm})) = \int g(f(x))\, dF(x).$$


Theorem 2 and the indicated modification of Theorem 1 apply, in particular,
to the case where $f(x)$ and $g(x)$ are powers of $x$.

The behavior of the distributions of the $EZ_{nm}$ and the $Ef(Z_{nm})$ is of interest
in connection with certain rank order tests. It has been shown by Hoeffding
[4] and Terry [6] that rank order tests for testing a hypothesis of randomness
which are most powerful against certain alternatives are based on statistics of
the form $c(R) = \sum_{j=1}^{n} a_j Ef(Z_{nR_j})$, where $R = (R_1, \ldots, R_n)$ is the vector of
the ranks of the observations and $f(x)$ is a given function. If all permutations of
the ranks are equally probable, the moments of $c(R)$ are functions of the power
sums $\sum_{j=1}^{n} [Ef(Z_{nj})]^k$. Theorems 1 and 2 give asymptotic expressions for these
power sums. Tests of this type were already considered by Fisher and Yates
[2], whose Tables XX and XXI give the values of $EZ_{nj}$ and the (approximate)
values of $\sum_{j=1}^{n} (EZ_{nj})^2$ for $n \le 50$ when $F(x)$ is normal with mean 0 and variance 1.
Dwass [1] and Terry [6] use results implied by Theorems 1 and 2 to
study the asymptotic distributions of statistics of the form $c(R)$.


2. Preliminaries. The distribution function of $Z_{nm}$ will be denoted by $F_{nm}(x)$.
Since $Z_{nm} \le x$ if and only if at least $m$ of the values $X_1, \ldots, X_n$ are $\le x$, we
have

(5)    $F_{nm}(x) = \sum_{j=m}^{n} \binom{n}{j} F(x)^j [1 - F(x)]^{n-j} = \frac{n!}{(m-1)!\,(n-m)!} \int_0^{F(x)} t^{m-1} (1 - t)^{n-m} \, dt.$
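The two expressions for the distribution function of the $m$-th order statistic in (5), a binomial tail sum and an incomplete beta integral, can be compared numerically. The sketch below is only an illustration: the function names are hypothetical, $p$ stands for the value $F(x)$, and the integral is approximated by a composite Simpson rule rather than by any method used in the paper.

```python
from math import comb, factorial


def F_nm_binomial(n, m, p):
    """Binomial-sum form of (5): probability that at least m of n values are <= x."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(m, n + 1))


def F_nm_beta(n, m, p, panels=2000):
    """Integral form of (5), evaluated with composite Simpson's rule on [0, p].

    `panels` must be even; it is an illustrative accuracy parameter.
    """
    const = factorial(n) / (factorial(m - 1) * factorial(n - m))

    def integrand(t):
        return t ** (m - 1) * (1 - t) ** (n - m)

    h = p / panels
    total = integrand(0.0) + integrand(p)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return const * total * h / 3


print(F_nm_binomial(7, 3, 0.4), F_nm_beta(7, 3, 0.4))
```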





DISTRIBUTION OF EXPECTED VALUES 95 


The following three facts, which are known or easily verified, will be used in
the sequel.
I. If $Ef(X_1)$ exists, so does $Ef(Z_{nm})$ for all $n$, $m$.
II. $\sum_{m=1}^{n} Ef(Z_{nm}) = n\,Ef(X_1)$.
III. (Cf. Jensen [5].) If $h(x)$ is convex and $U$ is a random variable such that
$EU$ and $Eh(U)$ exist, we have $h(EU) \le Eh(U)$.
Repeated use will be made of the following Lemma 1, which is an immediate
consequence of an extension by Fréchet and Shohat [3] of a theorem of Helly.
LEMMA 1. Let $V(x)$, $V_n(x)$, $n = 1, 2, \ldots$, be a sequence of functions which are
uniformly bounded and of uniformly bounded variation on any finite interval, such
that $\lim_{n \to \infty} V_n(x) = V(x)$ for all $x$, with the possible exception of a countable set.
Let $f(x)$ be a continuous function such that

$\int_{-\infty}^{\infty} f(x) \, dV(x) \quad \text{and} \quad \int_{-\infty}^{\infty} f(x) \, dV_n(x), \quad n = 1, 2, \ldots,$

exist and

$\lim_{A \to \infty} \int_{|x| > A} f(x) \, dV_n(x) = 0$

uniformly with respect to $n$. Then

$\lim_{n \to \infty} \int_{-\infty}^{\infty} f(x) \, dV_n(x) = \int_{-\infty}^{\infty} f(x) \, dV(x).$

3. Proofs. Theorem 1 will be proved with the help of several lemmas. 
LEMMA 2. Given $\epsilon > 0$, there exist two numbers $C$ and $a$, where $0 < a < 1$,
such that for every $n \ge 2$

(6)    $F_{nm}(x) \le C a^n F(x)$  if  $F(x) + \epsilon \le \frac{m-1}{n-1},$

(7)    $1 - F_{nm}(x) \le C a^n [1 - F(x)]$  if  $F(x) - \epsilon \ge \frac{m-1}{n-1}.$

PROOF. Let $s = (m - 1)/(n - 1)$, $v = F(x)$,

$H(s, v) = \frac{\int_0^v [t^s (1 - t)^{1-s}]^{n-1} \, dt}{\int_0^1 [t^s (1 - t)^{1-s}]^{n-1} \, dt}.$

Then inequalities (6) and (7) can be written as

(8)    $H(s, v) \le C a^n v$  if  $v + \epsilon \le s,$

(9)    $1 - H(s, v) \le C a^n (1 - v)$  if  $v - \epsilon \ge s.$

For $s$ arbitrarily fixed, $0 < s < 1$, the function $t^s (1 - t)^{1-s}$ increases for
$0 < t < s$ and decreases for $s < t < 1$. Hence the quantity

$2b = \min_{\epsilon \le s \le 1} [s^s (1 - s)^{1-s} - (s - \epsilon)^s (1 - s + \epsilon)^{1-s}],$

where $s^s (1 - s)^{1-s} = 1$ if $s = 0$ or $1$, is positive. We have for $v \le s - \epsilon$

(10)    $\int_0^v [t^s (1 - t)^{1-s}]^{n-1} \, dt \le [(s - \epsilon)^s (1 - s + \epsilon)^{1-s}]^{n-1} v \le [s^s (1 - s)^{1-s} - 2b]^{n-1} v.$

On the other hand, we can choose a positive number $d$ so that for
every $s$, $0 \le s \le 1$,

$s^s (1 - s)^{1-s} - t^s (1 - t)^{1-s} \le b$  if  $|t - s| \le d$.

Then we have

(11)    $\int_0^1 [t^s (1 - t)^{1-s}]^{n-1} \, dt \ge \int_{|t - s| \le d} [t^s (1 - t)^{1-s}]^{n-1} \, dt \ge d\,[s^s (1 - s)^{1-s} - b]^{n-1}.$

From (10) and (11) we have for $v + \epsilon \le s$

(12)    $H(s, v) \le d^{-1} [K(s)]^{n-1} v,$

where

(13)    $K(s) = \frac{s^s (1 - s)^{1-s} - 2b}{s^s (1 - s)^{1-s} - b} \le \frac{1 - 2b}{1 - b}.$

If we put $a = (1 - 2b)/(1 - b)$ and $C = d^{-1} a^{-1}$, inequality (8) follows from
(12) and (13).
Inequality (9) is obtained from (8) by observing that $1 - H(s, v) = H(1 - s, 1 - v)$.
This completes the proof.
The following Lemmas 3 and 4 are immediately obtained from Lemma 2.
LEMMA 3. If $m/n \to c$ as $n \to \infty$, then

$\lim_{n \to \infty} F_{nm}(x) = \begin{cases} 0 & \text{if } F(x) < c, \\ 1 & \text{if } F(x) > c. \end{cases}$

LEMMA 4. If $m/n \to c$ as $n \to \infty$, where $0 < c < 1$, there exist two numbers
$N$ and $d > 0$ such that for $n > N$

$F_{nm}(x) \le F(x)$  if  $F(x) < d$,

$1 - F_{nm}(x) \le 1 - F(x)$  if  $1 - F(x) < d$.

Let $S$ be the set on the real line which consists of all points of discontinuity of
$F(x)$ and all points $x$ such that $F(x - h) < F(x) < F(x + h)$ for every $h > 0$.

LEMMA 5. Let $y \in S$, $0 < a < 1$. If $m/n \to aF(y - 0) + (1 - a)F(y + 0)$
as $n \to \infty$, then

(14)    $\lim_{n \to \infty} EZ_{nm} = y.$
PROOF. By Lemma 1 it suffices to show that

(15)    $\lim_{n \to \infty} F_{nm}(x) = \begin{cases} 0 & \text{if } x < y, \\ 1 & \text{if } x > y, \end{cases}$

and that

(16)    $\lim_{A \to \infty} \int_{|x| > A} x \, dF_{nm}(x) = 0$  uniformly with respect to $n$.

Let $c = aF(y - 0) + (1 - a)F(y + 0)$. Since $y \in S$, the inequalities $x < y < z$
imply $F(x) < c < F(z)$. Hence (15) follows from Lemma 3.

The assumptions $y \in S$, $0 < a < 1$ imply that $0 < c < 1$. Let $d$ and $N$
be defined as in Lemma 4. Given $\epsilon > 0$, choose $B > 0$ so that $F(-B) < d$,
$1 - F(B) < d$,

$-\int_{-\infty}^{-B} x \, dF(x) < \frac{\epsilon}{2}, \qquad \int_{B}^{\infty} x \, dF(x) < \frac{\epsilon}{2},$

and $F(x)$ and $F_{nm}(x)$ are continuous at $x = \pm B$. Then

(17)    $-\int_{-\infty}^{-B} x \, dF_{nm}(x) = B F_{nm}(-B) + \int_{-\infty}^{-B} F_{nm}(x) \, dx.$

Applying Lemma 4, we have that for $n > N$ the right-hand side of (17) does
not exceed

$B F(-B) + \int_{-\infty}^{-B} F(x) \, dx = -\int_{-\infty}^{-B} x \, dF(x).$

Hence if $n > N$, $-\int_{-\infty}^{-B} x \, dF_{nm}(x) < \epsilon/2$ and, similarly, $\int_{B}^{\infty} x \, dF_{nm}(x) < \epsilon/2$.
This implies (16). The proof is complete.
Let

(18)    $G_{nm}(x) = \frac{1}{n} \sum_{j=1}^{m} F_{nj}(x).$

LEMMA 6. If $m/n \to c$ as $n \to \infty$, then

$\lim_{n \to \infty} G_{nm}(x) = \begin{cases} F(x) & \text{if } F(x) < c, \\ c & \text{if } F(x) > c. \end{cases}$

PROOF. By (5) and (18),

$n\,G_{nm}(x) = \sum_{j=1}^{m} \sum_{k=j}^{n} \binom{n}{k} F(x)^k [1 - F(x)]^{n-k}$

$\qquad = \sum_{k=1}^{m} k \binom{n}{k} F(x)^k [1 - F(x)]^{n-k} + m \sum_{k=m+1}^{n} \binom{n}{k} F(x)^k [1 - F(x)]^{n-k},$

whence

(19)    $G_{nm}(x) = F(x)[1 - F_{n-1,m}(x)] + \frac{m}{n} F_{n,m+1}(x)$

and $G_{nn}(x) = F(x)$. Lemma 6 now follows from Lemma 3.
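The closed form (19) for the averaged distribution function defined in (18) can be checked against the defining sum. The sketch below assumes the reading $G_{nm} = F[1 - F_{n-1,m}] + (m/n) F_{n,m+1}$ with $p = F(x)$, and treats $F_{nm}$ as zero when $m > n$ (an empty binomial sum); the function names are hypothetical.

```python
from math import comb


def F(n, m, p):
    """F_{nm}(x) from (5) with p = F(x); the sum is empty (zero) when m > n."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(m, n + 1))


def G_direct(n, m, p):
    """Definition (18): the average (1/n) * sum_{j=1}^{m} F_{nj}(x)."""
    return sum(F(n, j, p) for j in range(1, m + 1)) / n


def G_closed(n, m, p):
    """Closed form (19) as read here."""
    return p * (1 - F(n - 1, m, p)) + (m / n) * F(n, m + 1, p)


print(G_direct(8, 3, 0.35), G_closed(8, 3, 0.35))
```

With $m = n$ both functions return $p$, in agreement with $G_{nn}(x) = F(x)$.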

From (19) and Lemma 4 we easily obtain

LEMMA 7. If $m/n \to c$ as $n \to \infty$, where $0 < c < 1$, there exist two numbers
$N$ and $d > 0$ such that for $n > N$

$G_{nm}(x) \le 2F(x)$  if  $F(x) < d$,

$\frac{m}{n} - G_{nm}(x) \le 1 - F(x)$  if  $1 - F(x) < d$.

LEMMA 8. If $g(x)$ satisfies the conditions of Theorem 1 and $m/n \to F(y)$ as
$n \to \infty$, where $y$ is a point of continuity of $F(x)$, then

(20)    $\lim_{n \to \infty} \frac{1}{n} \sum_{j=1}^{m} Eg(Z_{nj}) = \int_{-\infty}^{y} g(x) \, dF(x).$

PROOF. Equation (20) can be written in the form

(21)    $\lim_{n \to \infty} \int_{-\infty}^{\infty} g(x) \, dG_{nm}(x) = \int_{-\infty}^{y} g(x) \, dF(x).$

By Lemma 1 it suffices to show that

(22)    $\lim_{n \to \infty} G_{nm}(x) = \begin{cases} F(x) & \text{if } x \le y, \\ F(y) & \text{if } x > y, \end{cases}$

for every $x$ at which $F(x)$ is continuous and that

(23)    $\lim_{A \to \infty} \int_{|x| > A} g(x) \, dG_{nm}(x) = 0$  uniformly with respect to $n$.

For every $y$ which is a point of continuity of $F(x)$ we can choose two numbers
$y_1$, $y_2$ in $S$ and two numbers $a_1$, $a_2$ in $(0, 1)$ such that if we let

$c_i = a_i F(y_i - 0) + (1 - a_i) F(y_i + 0),$

we have $c_1 \le F(y) \le c_2$ and $c_2 - c_1$ is arbitrarily small. Now choose $m_1 \le m$
and $m_2 \ge m$ in such a way that $m_1/n \to c_1$ and $m_2/n \to c_2$.
Since $G_{nm_1}(x) \le G_{nm}(x) \le G_{nm_2}(x)$, (22) now follows from Lemma 6.

To prove (23), we may assume without loss of generality that the function
$h(x)$ of Theorem 1 is nonincreasing for $-x$ sufficiently large and nondecreasing
for $x$ sufficiently large. Then (23) follows from

$\left| \int_{|x| > A} g(x) \, dG_{nm}(x) \right| \le \int_{|x| > A} h(x) \, dG_{nm}(x)$

and Lemma 7 in a similar way as in the proof of (16). This completes the proof
of Lemma 8.
Let

$H_n(y) = \frac{1}{n} \sum_{EZ_{nj} \le y} EZ_{nj}, \qquad H(y) = \int_{-\infty}^{y} x \, dF(x).$

LEMMA 9. If $y$ is a point of continuity of $F(x)$, $\lim_{n \to \infty} H_n(y) = H(y)$.
PROOF. We can write $H_n(y) = n^{-1} \sum_{j=1}^{m} EZ_{nj}$, where $m = m(y)$ is determined by

(24)    $EZ_{nm} \le y < EZ_{n,m+1}.$

This implies $m/n \to F(y)$. For otherwise a subsequence $\{m'/n'\}$ of $\{m/n\}$
must converge to a number $v \ne F(y)$. If $v < F(y)$, we can choose $x \in S$ and $a$
in $(0, 1)$ so that $v \le c < F(y)$, where $c = aF(x - 0) + (1 - a)F(x + 0)$. To
every $(m', n')$ we can choose an integer $m'' \ge m'$ so that $m''/n' \to c$. By Lemma
5 this implies $x = \lim EZ_{n',m''+1}$, hence $\limsup EZ_{n',m'+1} \le x < y$, which
contradicts (24). In a similar way the assumption $v > F(y)$ leads to a contradiction.

Lemma 9 now follows from Lemma 8 with $g(x) = x$.

LEMMA 10. If $g(x)$ satisfies the conditions of Theorem 1, we have

$\lim_{A \to \infty} \int_{|x| > A} g(x) \, dF_n(x) = 0$

uniformly with respect to $n$.

PROOF. If $A$ is a point of continuity of $F(x)$,

$\left| \int_{A}^{\infty} g(x) \, dF_n(x) \right| \le \int_{A}^{\infty} h(x) \, dF_n(x) = \frac{1}{n} \sum_{j=m}^{n} h(EZ_{nj}),$

where $EZ_{n,m-1} \le A < EZ_{nm}$. As shown in the proof of Lemma 9, $m/n \to F(A)$
as $n \to \infty$. Since $h(x)$ is convex, $n^{-1} \sum_{j=m}^{n} h(EZ_{nj}) \le n^{-1} \sum_{j=m}^{n} Eh(Z_{nj})$. By
Lemma 8 the right-hand side converges to $\int_{A}^{\infty} h(x) \, dF(x)$. Thus we obtain
an upper bound which can be made arbitrarily small and is independent of $n$.
The remainder of the proof is obvious.

PROOF OF THEOREM 1. Equation (4), which is to be proved, can be written
in the form

(25)    $\lim_{n \to \infty} \int_{-\infty}^{\infty} g(x) \, dF_n(x) = \int_{-\infty}^{\infty} g(x) \, dF(x),$

and this is equivalent to

(26)    $\lim_{n \to \infty} \int_{-\infty}^{\infty} \frac{g(x) - g(0)}{x} \, dH_n(x) = \int_{-\infty}^{\infty} \frac{g(x) - g(0)}{x} \, dH(x).$

First, suppose that the function $(g(x) - g(0))/x$ is continuous everywhere.
Then (26), and hence (25), follows from Lemmas 9 and 10 by using Lemma 1.
In particular, (25) is now proved for $g(x) = \cos tx$ and $\sin tx$. By the continuity
theorem for characteristic functions this implies that

(27)    $\lim_{n \to \infty} F_n(x) = F(x)$

for all points of continuity of $F(x)$. Equation (25) now follows for every $g(x)$
which satisfies the conditions of Theorem 1 by applying Lemma 1, (27) and
Lemma 10.

PROOF OF THEOREM 2. Since $f(x)$ and $g(x)$ are convex, we have $f(EZ_{nj}) \le Ef(Z_{nj})$
and $g(Ef(Z_{nj})) \le Eg(f(Z_{nj}))$. Since $g(x)$ is nondecreasing, $g(f(EZ_{nj})) \le g(Ef(Z_{nj}))$.
Hence

(28)    $\frac{1}{n} \sum_{j=1}^{n} g(f(EZ_{nj})) \le \frac{1}{n} \sum_{j=1}^{n} g(Ef(Z_{nj})) \le \frac{1}{n} \sum_{j=1}^{n} Eg(f(Z_{nj})) = \int_{-\infty}^{\infty} g(f(x)) \, dF(x).$

The first member of (28) converges to the last member if the function $g^*(x) = g(f(x))$
satisfies the conditions for $g(x)$ in Theorem 1. That these conditions are
satisfied follows from the fact that $g^*(x)$ is convex.
ON THE THEORY OF SYSTEMATIC SAMPLING, III. COMPARISON
OF CENTERED AND RANDOM START SYSTEMATIC SAMPLING


By William G. Madow


University of Illinois 


1. Summary. The main result obtained is the following: If a population has 
monotone decreasing correlogram, then centered systematic sampling is more 
efficient than random start systematic sampling. It is also shown that if a 
population is monotonic, then centered systematic sampling is more efficient 
than random start systematic sampling, but here it is easy to cite cases in which 
stratified random sampling is more efficient than either. Thus, centered system- 
atic sampling is more efficient than random start systematic sampling, in the 
conditions (namely, concave upwards and decreasing correlogram) in which 
Cochran [1] proved that random start systematic sampling is more efficient than 
stratified random sampling. 


2. Introduction, Types of Sampling Considered. In this paper, we discuss the 
theory of centered systematic sampling technique. As is well known, this tech- 
nique of selecting samples has long been of practical importance. The theory of 
centered systematic sampling should also be valid for random start systematic 
sampling with end-corrections (see Yates [5]) since the latter technique in effect 
reduces random start systematic sampling to centered systematic sampling. 

Inasmuch as the approach used in the demonstrations follows that of earlier 
papers by Cochran [1] and the present author [3], [4], notation and proofs are 
presented in condensed form. 

The elements of the population are $x_1, x_2, \ldots, x_N$ where $N = kn$. The objective
is to estimate $\bar{x}$, the arithmetic mean of the population, on the basis of
a sample of size $n$.

The random start systematic sampling estimate, $\bar{x}_{sy}$, is the arithmetic mean
of the $n$ elements obtained by selecting one element by an equal probability
selection method from $x_1, \ldots, x_k$ and including in the sample every $k$th element
thereafter. The arithmetic means of these $k$ possible samples are denoted by
$\bar{x}_1, \ldots, \bar{x}_k$, where $\bar{x}_i$ is the mean of the sample whose first element is $x_i$. The
variance of $\bar{x}_{sy}$ is denoted by $\sigma_{sy}^2$, expressed in terms of the elements of the
population.

If $k$ is odd, the centered systematic sampling estimate, $\bar{x}_c$, is $\bar{x}_{(k+1)/2}$, and if $k$
is even we arbitrarily define $\bar{x}_c = \bar{x}_{k/2}$. (Actually, if $k$ is even, one might either
select $k/2$ or $(k + 2)/2$ at random, or one might designate other patterns for
selecting the sample elements instead of, as above, designating the elements $x_{k/2}$,
$x_{3k/2}, \ldots, x_{(2n-1)k/2}$. For example, $x_{k/2}, x_{(3k+2)/2}, x_{5k/2}, x_{(7k+2)/2}, \ldots$ would be
preferable in a monotone population. For our present purposes, it is not important to
try to determine the best pattern.) The mean square of $\bar{x}_c$ about $\bar{x}$ is denoted by
$\sigma_c^2$.

In stratified random sampling we consider $x_{1+(j-1)k}, x_{2+(j-1)k}, \ldots, x_{jk}$ to
constitute a stratum, $j = 1, \ldots, n$. Hence, there are $n$ strata each consisting of
$k$ elements. We suppose that one element is selected from each of the $n$ strata
by an equal probability selection method. The sample mean is denoted by
$\bar{x}_{st}$, and the variance of $\bar{x}_{st}$ is denoted by $\sigma_{st}^2$, expressed in terms of the elements of
the population.

We use $E$ to denote the taking of the expected value when the elements of the
population are considered to be constants and $\mathcal{E}$ to denote the taking of the
expected value when the elements of the population are considered to be random
variables.

Inasmuch as we shall be using the word correlogram somewhat loosely in the
following, the word is now discussed. If $x_1, \ldots, x_N$ is an ordered sequence of
random variables, if $\rho_\delta$ is the correlation coefficient of two random variables
whose subscripts differ by $\delta$ (e.g., $\rho_2 = \sigma_{x_1 x_3}/\sigma_{x_1}\sigma_{x_3}$), and if the correlation is to
depend only on $\delta$, then the function $f(\delta) = \rho_\delta$, $\delta = 1, \ldots, N - 1$, is often called
the correlogram of the sequence. It is usually assumed that the random variables
have identical mean values and identical variances. However, when we use the
word correlogram in the following, it will refer only to the expected value of the
product $x_i x_h$, which we will assume to depend only on $\delta = |i - h|$. Thus, if the
random variables have identical mean values our statements refer to the usual
correlogram, but otherwise the condition we state does not assume the identity
of the mean values of the random variables.


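3. Monotone populations. Hotelling and Solomons [2] proved that for any
quantities $z_1, z_2, \ldots, z_q$, the following inequality is valid

(3.1)    $1 \ge \frac{q\,(\text{median} - \text{arithmetic mean})^2}{\sum_{i=1}^{q} (z_i - \text{arithmetic mean})^2}$

if all terms are finite and the denominator does not vanish, where, if $q$ is odd
the median has the usual definition, and if $q$ is even, and $z'$ and $z''$ are the two
central quantities, then the median can be any quantity such that $z' \le$ median
$\le z''$. (The details of their proof are given only for odd $q$ but follow at once for
even $q$.)

If a population is monotone, then either $\bar{x}_1 \le \bar{x}_2 \le \cdots \le \bar{x}_k$ or
$\bar{x}_1 \ge \bar{x}_2 \ge \cdots \ge \bar{x}_k$. Hence, if $k$ is odd, $\bar{x}_c$ is the median of $\bar{x}_1, \ldots, \bar{x}_k$ and, if $k$ is
even, $\bar{x}_c$ is such that $\bar{x}_{k/2} \le \bar{x}_c \le \bar{x}_{(k+2)/2}$ or $\bar{x}_{k/2} \ge \bar{x}_c \ge \bar{x}_{(k+2)/2}$. Then (3.1),
applied to the $k$ quantities $\bar{x}_1, \ldots, \bar{x}_k$ whose arithmetic mean is $\bar{x}$, becomes

(3.2)    $(\bar{x}_c - \bar{x})^2 \le \sigma_{sy}^2.$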
Since $(\bar{x}_c - \bar{x})^2$ is the mean square error of $\bar{x}_c$ about $\bar{x}$ when the elements of the
population are not random variables, we have proved:

THEOREM 1. If the population is monotone, then centered systematic sampling is
more efficient than random start systematic sampling.
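A small computation can illustrate Theorem 1 on a concrete monotone population. The sketch below is not from the paper: the population $x_i = i^2$, the function names, and the 0-based indexing are assumptions made for illustration. It compares the variance of the random start estimate (the mean square of the $k$ sample means about the population mean) with the squared error of the centered sample mean.

```python
def systematic_means(x, k):
    """Means of the k systematic samples x[i], x[i + k], ... for 0-based starts i."""
    n = len(x) // k
    return [sum(x[i + j * k] for j in range(n)) / n for i in range(k)]


def compare(x, k):
    """Return (variance of random start estimate, squared error of centered estimate)."""
    means = systematic_means(x, k)
    pop_mean = sum(x) / len(x)
    var_random_start = sum((m - pop_mean) ** 2 for m in means) / k
    # 0-based index of the start (k + 1)/2 for odd k, or k/2 for even k.
    centered = means[(k - 1) // 2]
    return var_random_start, (centered - pop_mean) ** 2


population = [i * i for i in range(1, 36)]  # monotone, N = 35 = 5 * 7
print(compare(population, 5))
```

For this population the second value is far smaller than the first, which is the behavior the theorem describes; the theorem itself only guarantees that it is not larger.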

Of course, if a population is monotone, and if the size of sample is sufficiently
large, then stratified random sampling may be more efficient than centered
systematic sampling, since the latter estimate may have a bias that does not
tend to zero sufficiently rapidly as the size of sample increases.

In practice, however, even if a population is monotone, centered systematic
sampling will often be more efficient than stratified random sampling. To see
how this will occur let us define the average variance and average covariance
terms of $\sigma_c^2$.

Let

$\tilde{x}_j = \frac{1}{k} \sum_{i=1}^{k} x_{i+(j-1)k}.$

Then $\sigma_c^2 = S + C$, where

$S = \frac{1}{n^2} \sum_{j=1}^{n} (x_{a+(j-1)k} - \tilde{x}_j)^2,$

$C = \frac{1}{n^2} \sum_{j,m=1;\, j \ne m}^{n} (x_{a+(j-1)k} - \tilde{x}_j)(x_{a+(m-1)k} - \tilde{x}_m),$

and $a = (k + 1)/2$ if $k$ is odd, $a = k/2$ if $k$ is even. We call $S$ the average variance
term and $C$ the average covariance term of $\sigma_c^2$.
By the result of Hotelling and Solomons

$(x_{a+(j-1)k} - \tilde{x}_j)^2 \le E(x_{i+(j-1)k} - \tilde{x}_j)^2, \quad j = 1, \ldots, n.$

Hence $S \le \sigma_{st}^2$. Thus, if $C < \sigma_{st}^2 - S$ then $\sigma_c^2 < \sigma_{st}^2$. In practice, the average
covariance term, $C$, is often small enough for the above condition on $C$ to be
satisfied.


4. Populations with monotone decreasing correlograms. (Actually, it is terms 
such as (4.1) below that will be assumed to be monotone decreasing.) 

We will need the following notation in this section. Unless specific limits are
stated, the letters $i$, $h$ will assume all integral values from 1 through $k$; the letters
$j$, $m$ will assume all integral values from 1 through $n$; the letter $\gamma$ will assume all
integral values from 1 through $n - 1$; the letter $\delta$ will assume all integral values
from 1 through $k - 1$; and the letter $\epsilon$ will assume all integral values from 1
through $\frac{1}{2}(k - 1)$. (In the proof $k$ is assumed to be odd. The case where $k$ is even
introduces further complications and notation without altering the basic results.)
We now suppose that the elements of the population are random variables, and let

(4.1)    $\mathcal{E}\, x_{\alpha+(j-1)k}\, x_{\beta+(m-1)k} = \mu_{(m-j)k+\delta},$

where $\delta = \beta - \alpha$ and $j \le m$. Thus, $-(k - 1) \le \delta \le (k - 1)$.
THEOREM 2. Under the conditions stated

(4.2)    $\mathcal{E}\sigma_{sy}^2 - \mathcal{E}\sigma_c^2 = \frac{4}{n^2 k^2} \sum_{\gamma=0}^{n-1} \sum_{\epsilon} \epsilon\,(\mu_{\gamma k+\epsilon} - \mu_{(\gamma+1)k-\epsilon}).$

If $\mu_1 \ge \mu_2 \ge \cdots \ge \mu_{nk-1}$ and the inequality holds at least once, then centered
systematic sampling is more efficient than random start systematic sampling, while
if $\mu_1 \le \mu_2 \le \cdots \le \mu_{nk-1}$ and the inequality holds at least once, then the contrary is
true.

Before proving Theorem 2 let us consider some of its implications. Actually
from (4.2) it follows that $\mathcal{E}\sigma_{sy}^2 = \mathcal{E}\sigma_c^2$ if the elements of the population have the
same expected product (4.1) no matter how distant they are, that is, if
$\mu_1 = \mu_2 = \cdots = \mu_{nk-1}$. If we assume all elements of the population have the same
expected value then the above statement is made for the serial covariance rather
than the expected product. For example, if the elements of the population have
the same expected values and are uncorrelated, then $\mathcal{E}\sigma_{sy}^2 = \mathcal{E}\sigma_c^2$.

Furthermore, the conditions stated above under which centered systematic
sampling is less efficient than random start systematic sampling should almost
never be satisfied in practice. In practice, however, irregularities of the
correlogram may well lead to the greater efficiency of random start systematic
sampling as compared with centered systematic sampling.
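The identity (4.2) can be checked numerically for small $n$ and odd $k$. The sketch below assumes the reading $\mathcal{E}\sigma_{sy}^2 - \mathcal{E}\sigma_c^2 = \frac{4}{n^2k^2}\sum_{\gamma=0}^{n-1}\sum_{\epsilon}\epsilon\,(\mu_{\gamma k+\epsilon} - \mu_{(\gamma+1)k-\epsilon})$ and the convention that the expected product of two elements depends only on the absolute difference of their positions; the function names and the example sequence $\mu_d = 0.8^d$ are hypothetical. The left-hand side is computed directly from the definitions as a quadratic form.

```python
def expected_square(w, mu):
    """E (sum_u w[u] * x_u)^2 when E x_u x_v = mu(|u - v|)."""
    size = len(w)
    return sum(w[u] * w[v] * mu(abs(u - v)) for u in range(size) for v in range(size))


def delta_direct(n, k, mu):
    """E sigma_sy^2 - E sigma_c^2 computed from the definitions (k odd)."""
    size = n * k

    def weights(start):
        # Weight vector of (systematic sample mean with 0-based start) - (population mean).
        return [(1 / n if u % k == start else 0.0) - 1 / size for u in range(size)]

    var_sy = sum(expected_square(weights(i), mu) for i in range(k)) / k
    mse_c = expected_square(weights((k - 1) // 2), mu)
    return var_sy - mse_c


def delta_formula(n, k, mu):
    """Right-hand side of (4.2) as read above (k odd)."""
    total = sum(
        e * (mu(g * k + e) - mu((g + 1) * k - e))
        for g in range(n)
        for e in range(1, (k - 1) // 2 + 1)
    )
    return 4 * total / (n * n * k * k)


decreasing = lambda d: 0.8**d
print(delta_direct(4, 5, decreasing), delta_formula(4, 5, decreasing))
```

With a decreasing sequence the difference is positive, and with an increasing one it changes sign, which matches the two cases of the theorem.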

PROOF. The demonstration of (4.2) is tedious, but not difficult. Let us begin
by obtaining the following two lemmas.

LEMMA 1. If $f(i - h)$ is a function of the difference of the integers $i$ and $h$, then

(4.3)    $\sum_{i,h} f(i - h) = k f(0) + \sum_{\delta} (k - \delta)[f(\delta) + f(-\delta)].$

Also,

(4.4)    $\sum_{i,h} f(|i - h|) = k f(0) + 2 \sum_{\delta} (k - \delta) f(\delta).$

The proof is omitted.
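LEMMA 2. Let $\delta = |i - h|$. Then, if $i \ne h$,

(4.5)    $\mathcal{E}\bar{x}_i \bar{x}_h = \frac{\mu_\delta}{n} + \frac{1}{n^2} \sum_{\gamma} (n - \gamma)[\mu_{\gamma k+\delta} + \mu_{\gamma k-\delta}],$

and

(4.6)    $\mathcal{E}\bar{x}_i^2 = \frac{\mu_0}{n} + \frac{2}{n^2} \sum_{\gamma} (n - \gamma)\mu_{\gamma k}.$

PROOF. We now denote $x_{i+(j-1)k}$ by $x_{ji}$. Since $\bar{x}_i = (1/n) \sum_j x_{ji}$, it follows
that

$\mathcal{E}\bar{x}_i \bar{x}_h = \frac{1}{n^2} \sum_{j} \mathcal{E} x_{ji} x_{jh} + \frac{1}{n^2} \sum_{j<m} \mathcal{E} x_{ji} x_{mh} + \frac{1}{n^2} \sum_{j>m} \mathcal{E} x_{ji} x_{mh}$

$\qquad = \frac{\mu_\delta}{n} + \frac{1}{n^2} \sum_{\gamma} (n - \gamma)[\mu_{\gamma k+\delta} + \mu_{\gamma k-\delta}].$

Thus (4.5) is proved. Then (4.6) is a special case.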
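We return to the proof of Theorem 2. Now, putting

$\Delta = \mathcal{E}\sigma_{sy}^2 - \mathcal{E}\sigma_c^2,$

it follows that

$\Delta = \frac{1}{k} \sum_{i} \mathcal{E}\bar{x}_i^2 - \mathcal{E}\bar{x}_c^2 + 2\mathcal{E}\bar{x}_c\bar{x} - 2\mathcal{E}\bar{x}^2.$

Since, from Lemma 2, $\mathcal{E}\bar{x}_i^2$ is independent of $i$, it follows that

$\Delta = 2\mathcal{E}\bar{x}_c\bar{x} - 2\mathcal{E}\bar{x}^2.$

Now, from (4.5) and (4.6), taking $i = (k + 1)/2$ and averaging over $h$, it follows that

$\mathcal{E}\bar{x}_c\bar{x} = \frac{1}{nk} \left\{ \mu_0 + 2\sum_{\epsilon} \mu_\epsilon + \frac{2}{n} \sum_{\gamma} (n - \gamma)\mu_{\gamma k} + \frac{2}{n} \sum_{\epsilon} \sum_{\gamma} (n - \gamma)[\mu_{\gamma k+\epsilon} + \mu_{\gamma k-\epsilon}] \right\},$

and, since by Lemma 2, $\mathcal{E}\bar{x}_i\bar{x}_h$ depends only on $|i - h|$, it follows from Lemma
1 that

$\mathcal{E}\bar{x}^2 = \frac{1}{k^2} \sum_{i,h} \mathcal{E}\bar{x}_i\bar{x}_h = \frac{1}{nk} \left\{ \mu_0 + \frac{2}{n} \sum_{\gamma} (n - \gamma)\mu_{\gamma k} \right\} + \frac{2}{nk^2} \sum_{\delta} (k - \delta) \left\{ \mu_\delta + \frac{1}{n} \sum_{\gamma} (n - \gamma)[\mu_{\gamma k+\delta} + \mu_{\gamma k-\delta}] \right\}.$

Then

$\Delta = \frac{4}{nk^2} \sum_{\epsilon} \epsilon\,(\mu_\epsilon - \mu_{k-\epsilon}) + \frac{4}{n^2k^2} \sum_{\gamma} \sum_{\epsilon} (n - \gamma)\,\epsilon\,[\mu_{\gamma k+\epsilon} + \mu_{\gamma k-\epsilon} - \mu_{(\gamma+1)k-\epsilon} - \mu_{(\gamma-1)k+\epsilon}].$

Now

$\sum_{\gamma} (n - \gamma)[\mu_{\gamma k+\epsilon} - \mu_{(\gamma-1)k+\epsilon}] = -n\mu_\epsilon + \sum_{\gamma=0}^{n-1} \mu_{\gamma k+\epsilon},$

$\sum_{\gamma} (n - \gamma)[\mu_{\gamma k-\epsilon} - \mu_{(\gamma+1)k-\epsilon}] = n\mu_{k-\epsilon} - \sum_{\gamma=0}^{n-1} \mu_{(\gamma+1)k-\epsilon}.$

Hence

$\sum_{\gamma} \sum_{\epsilon} \frac{n - \gamma}{n}\,\epsilon\,[\mu_{\gamma k+\epsilon} + \mu_{\gamma k-\epsilon} - \mu_{(\gamma+1)k-\epsilon} - \mu_{(\gamma-1)k+\epsilon}]$

$\qquad = -\sum_{\epsilon} \epsilon\,(\mu_\epsilon - \mu_{k-\epsilon}) + \frac{1}{n} \sum_{\gamma=0}^{n-1} \sum_{\epsilon} \epsilon\,(\mu_{\gamma k+\epsilon} - \mu_{(\gamma+1)k-\epsilon}).$

Thus, (4.2) is proved. Since $1 \le \epsilon \le (k - 1)/2$, it follows that $\mu_{\gamma k+\epsilon} \ge \mu_{(\gamma+1)k-\epsilon}$
if $\mu_1 \ge \mu_2 \ge \cdots \ge \mu_{nk-1}$. Hence $\Delta \ge 0$ if the correlogram is monotone decreasing
and $\Delta > 0$ if the correlogram is monotone decreasing and not constant.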


5. Comments. It is easy to extend the results of this paper to two-dimensional
statistical sampling and to the sampling of clusters. These topics will be discussed
in following papers.

It is interesting to note that if $\mathcal{E}x_{i+(j-1)k}^2$ is not assumed to be independent of
$j$ and $i$ then the above results will not hold without further assumptions
concerning the terms $\mathcal{E}x_{i+(j-1)k}^2$.
A MODIFICATION OF SCHWARZ'S INEQUALITY WITH
APPLICATIONS TO DISTRIBUTIONS¹


By Sigeiti Moriguti


University of North Carolina and University of Tokyo


Summary. Theorem 1 provides us with a result from which we can derive the
modified Schwarz inequality (1.2) or, more generally, (3.8). The formulas hold
when $x(t)$ is any nondecreasing function belonging to a certain wide class, and
$\bar\varphi(t)$ is the right-hand derivative of the "greatest convex minorant" of $\Phi(t)$. The
necessary and sufficient conditions for equality to hold are also given. Applications
to distribution problems in statistics are discussed in Section 4.


1. Introduction. In a previous paper [1], the author made use of Schwarz's
inequality in obtaining the least upper bound of an integral of the form

(1.1)    $\int_a^b x(t)\varphi(t) \, dt,$

where $x(t)$ is a variable function and $\varphi(t)$ is a given function. However, if the
domain of variation of $x(t)$ is limited to the class of nondecreasing functions, then
Schwarz's inequality does not give the least upper bound unless $\varphi(t)$ itself is
effectively a nondecreasing function, because the equality holds only if $x(t)$ is
effectively proportional to $\varphi(t)$.

In the present paper, we shall derive a modified Schwarz inequality

(1.2)    $\int_a^b x(t)\varphi(t) \, dt \le \left\{ \int_a^b x(t)^2 \, dt \right\}^{1/2} \left\{ \int_a^b \bar\varphi(t)^2 \, dt \right\}^{1/2},$

where $\bar\varphi(t)$ is a nondecreasing function closely related to $\varphi(t)$. It holds for any
nondecreasing function $x(t)$ and for any function $\varphi(t)$ in a certain wide class, the
equality being satisfied if $x(t) = A\bar\varphi(t)$ holds with nonnegative constant coefficient
$A$ almost everywhere.

The relation between $\varphi(t)$ and $\bar\varphi(t)$ is most simply stated by means of the concept
of the "greatest convex minorant," whose definition and some of whose
properties we shall state in the next section.

The inequality (1.2) can profitably be generalized to (3.8), which reduces to
(1.2) if $\Phi(t)$ is absolutely continuous and $\Phi'(t) = \varphi(t)$.


2. Greatest convex minorant. See for instance ([2] p. 440) and ([3] pp. 91, 94).
For a given function $\Phi(t)$ in a closed interval $[a, b]$, we consider the supremum of
all convex functions dominated by $\Phi(t)$ in the whole interval. (A convex function
is characterized by the fact that the middle point of any chord of its graph

Received 3/13/52, revised 9/24/52. 


1 Presented at the Blacksburg meeting of the Institute of Mathematical Statistics, March 
19-21, 1952. 
lies above or on the curve.) Let it be denoted by $\bar\Phi(t)$ and be called the greatest
convex minorant of $\Phi(t)$ in the interval $[a, b]$. It is easy to see that $\bar\Phi(t)$ itself is a
convex function dominated by $\Phi(t)$.

In the following we consider only the case where $\Phi(t)$ is a function of bounded
variation in $[a, b]$ and continuous at both ends. Then its greatest convex minorant
$\bar\Phi(t)$ is bounded above; hence, it is continuous (see the references). For every $t$ in
$[a, b]$, $\bar\Phi(t) \le \Phi(t)$. The equality holds certainly at the end points because of the
assumed continuity of $\Phi(t)$ there. The set of values of $t$ for which $\bar\Phi(t) = \min
\{\Phi(t - 0), \Phi(t + 0)\}$ is a closed set (due to the definition of $\bar\Phi(t)$). Therefore its
complementary set, that is the set of values of $t$ for which $\bar\Phi(t) < \min \{\Phi(t - 0),
\Phi(t + 0)\}$, is an open set and consequently either empty or the union of a
denumerable number of disjoint open intervals. In each of those intervals, if any,
$\bar\Phi(t)$ is a linear function, as follows from the definition of $\bar\Phi(t)$.

As a continuous convex function, $\bar\Phi(t)$ has everywhere (except perhaps at the
end points) finite left-hand and right-hand derivatives; the left-hand derivative
is not greater than the right-hand derivative at the same point. Let us denote,
for the sake of definiteness, the right-hand derivative of $\bar\Phi(t)$ by $\bar\varphi(t)$. Then $\bar\varphi(t)$
is of course a nondecreasing function and is consequently continuous except
perhaps for a denumerable set of values of $t$. $\bar\varphi(t)$ is a constant in any interval
where $\bar\Phi(t) < \min \{\Phi(t - 0), \Phi(t + 0)\}$.
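On a finite grid, the greatest convex minorant can be approximated by the lower convex hull of the sampled points, and its right-hand derivative by the slopes of the hull segments. The sketch below is an illustration under that discretization assumption, not a construction from the paper; the function names are hypothetical. As a test case it uses the step-type function of Example 1 in Section 4 (equal to $-F$ below a point $\alpha$ and $1 - F$ above it), taking the smaller one-sided value at the jump.

```python
def greatest_convex_minorant(points):
    """Lower convex hull of (t, value) points sorted by t (monotone chain scan)."""
    hull = []
    for p in points:
        while len(hull) >= 2:
            (ax, ay), (bx, by) = hull[-2], hull[-1]
            cross = (bx - ax) * (p[1] - ay) - (by - ay) * (p[0] - ax)
            if cross <= 0:  # hull[-1] lies on or above the chord from hull[-2] to p
                hull.pop()
            else:
                break
        hull.append(p)
    return hull


def integral_of_squared_slope(hull):
    """Integral of the squared right-hand derivative of the piecewise linear minorant."""
    total = 0.0
    for (t0, v0), (t1, v1) in zip(hull, hull[1:]):
        slope = (v1 - v0) / (t1 - t0)
        total += slope * slope * (t1 - t0)
    return total


alpha, steps = 0.25, 400


def Phi(F):
    # Step-type example; the smaller one-sided limit is used at the jump F = alpha.
    return -F if F <= alpha else 1.0 - F


grid = [(i / steps, Phi(i / steps)) for i in range(steps + 1)]
hull = greatest_convex_minorant(grid)
print(integral_of_squared_slope(hull))  # close to alpha / (1 - alpha)
```

For this example the minorant has slope $-1$ on $(0, \alpha)$ and slope $\alpha/(1 - \alpha)$ on $(\alpha, 1)$, so the integral of the squared slope is $\alpha/(1 - \alpha)$.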


3. Modified Schwarz inequality.
THEOREM 1. Let $\Phi(t)$ be a function of bounded variation in the closed interval
$[a, b]$ and continuous at both ends. Then the relation

(3.1)    $\int_a^b x(t) \, d\Phi(t) \le \int_a^b x(t)\bar\varphi(t) \, dt$

holds for any nondecreasing function $x(t)$ for which the integrals exist and are
finite, where $\bar\varphi(t)$ is the right-hand derivative of the greatest convex minorant $\bar\Phi(t)$
of $\Phi(t)$. The equality in (3.1) holds if and only if $x(t)$ is a constant in every
interval where

(3.2)    $\bar\Phi(t) < \min \{\Phi(t - 0), \Phi(t + 0)\}$

and, at every point of discontinuity $t_n$, if any, of $\Phi(t)$,

(3.3)    $x(t_n) = \begin{cases} x(t_n + 0) & \text{when } \Phi(t_n - 0) < \Phi(t_n + 0), \\ x(t_n - 0) & \text{when } \Phi(t_n - 0) > \Phi(t_n + 0). \end{cases}$


PROOF. Let us first assume that $x(t)$ is bounded in $[a, b]$ and hence of bounded
variation. So is $\Phi(t)$. Therefore we can apply the formula of integration by
parts (see [4] for a formula which can be adapted to our purpose with a slight
modification) and get

(3.4)    $\int_a^b x(t \pm 0) \, d\Phi(t) = [x(t)\Phi(t)]_a^b - \int_a^b \Phi(t \mp 0) \, dx(t),$

where, at every point of discontinuity, if any, of $\Phi(t)$, we take the smaller of
$\Phi(t - 0)$ and $\Phi(t + 0)$ and decide the sign correspondingly. (We take for instance
$x(t + 0)$ and $\Phi(t - 0)$ if $\Phi(t - 0) < \Phi(t + 0)$.) Similarly, as $\bar\Phi(t)$ is continuous,
we get

(3.5)    $\int_a^b x(t)\bar\varphi(t) \, dt = [x(t)\bar\Phi(t)]_a^b - \int_a^b \bar\Phi(t) \, dx(t).$

Thus we have

(3.6)    $\int_a^b x(t)\bar\varphi(t) \, dt - \int_a^b x(t \pm 0) \, d\Phi(t) = \int_a^b \{\Phi(t \mp 0) - \bar\Phi(t)\} \, dx(t),$

the first term on the right-hand side of (3.4) and (3.5) cancelling because $\Phi(t) =
\bar\Phi(t)$ at both ends.

Now, as stated in the previous section, $\{\Phi(t \mp 0) - \bar\Phi(t)\}$ in (3.6) vanishes
except perhaps on a set consisting of a denumerable number of open intervals,
where it is positive. Therefore, (3.6) is nonnegative and vanishes if and only if
$x(t)$ is constant in every such interval.

The difference between the left-hand member of (3.1) and the second term
of the left-hand member of (3.6) can come only from the contributions at the
points of discontinuity of $\Phi(t)$. Thus

(3.7)    $\int_a^b x(t \pm 0) \, d\Phi(t) - \int_a^b x(t) \, d\Phi(t) = \sum_n \{x(t_n \pm 0) - x(t_n)\}\{\Phi(t_n + 0) - \Phi(t_n - 0)\},$

summation taken over all points of discontinuity of $\Phi(t)$. Since $x(t_n - 0) \le
x(t_n) \le x(t_n + 0)$, and in view of the above-mentioned choice as to the double
sign, it is clear that each term is positive unless $x(t_n) = x(t_n \pm 0)$. Hence, (3.7)
is nonnegative and vanishes if and only if (3.3) holds at every $t_n$.

Adding (3.6) and (3.7), we get an equation expressing the difference between
the two members of (3.1) as the sum of the two nonnegative quantities. Hence
(3.1) holds. In order that the equality in (3.1) hold, it is necessary and sufficient
that both (3.6) and (3.7) vanish. Thus the theorem is proved for bounded $x(t)$.

When $x(t)$ is not bounded in $[a, b]$, we can still derive (3.6) by taking the limit
of a sequence of similar formulas for narrower intervals. The rest of the proof
does not need any change.

COROLLARY. Under the same assumptions, the relation

(3.8)    $\int_a^b x(t) \, d\Phi(t) \le \left\{ \int_a^b x(t)^2 \, dt \right\}^{1/2} \left\{ \int_a^b \bar\varphi(t)^2 \, dt \right\}^{1/2}$

holds for any nondecreasing function $x(t)$ such that the integrals including $x(t)$ exist
and are finite. If $\bar\varphi(t)^2$ is summable in $(a, b)$ and if $\bar\varphi(t)$ is not identically equal to
zero in $(a, b)$, then necessary and sufficient conditions for the equality in (3.8) to
hold are that $x(t) = A\bar\varphi(t)$ almost everywhere (with nonnegative constant coefficient
$A$) and that $x(t)$ satisfy (3.3) at every point of discontinuity, if any, of $\Phi(t)$.





110 SIGEITI MORIGUTI 


PROOF. The ordinary Schwarz inequality applied to the right-hand member
of (3.1) leads us to the inequality (3.8).

We note that (3.8) may be called a modified Schwarz inequality.

As a particular case of (3.1), the inequality

(3.9)    $\int_a^b x(t)\varphi(t) \, dt \ge \frac{1}{b - a} \int_a^b x(t) \, dt \int_a^b \varphi(t) \, dt$

can be derived for any nondecreasing functions $x(t)$ and $\varphi(t)$, both summable in
$(a, b)$. This is Tchebycheff's inequality (see [3] p. 168, Theorem 236). Incidentally,
a change of variables leads us from (3.9) to a formula which implies the
one on p. 601 of [11].

THEOREM 2. Let $\Phi(t)$ be a function of bounded variation in $[a, b]$ and continuous
at both ends. Let moreover its value be the same at both ends. Then for any
nondecreasing function $x(t)$ belonging to $L_2(a, b)$ and summable with respect to $\Phi$,

(3.10)    $\int_a^b x(t) \, d\Phi(t) \le \left\{ \int_a^b [x(t) - C]^2 \, dt \right\}^{1/2} \left\{ \int_a^b \bar\varphi(t)^2 \, dt \right\}^{1/2},$

where $C$ is any constant and $\bar\varphi(t)$ is the right-hand derivative of the greatest convex
minorant $\bar\Phi(t)$ of $\Phi(t)$. If $\bar\varphi(t)$ belongs to $L_2(a, b)$ and is not identically equal to zero
in $(a, b)$, then the equality in (3.10) holds if and only if $x(t) - C = A\bar\varphi(t)$ almost
everywhere (with nonnegative coefficient $A$) and (3.3) holds at every point of
discontinuity, if any, of $\Phi(t)$.

PROOF. The integral on the left-hand side of (3.10) does not change its value
if one replaces $x(t)$ by $x(t) - C$. Therefore we get (3.10) from (3.8).

We note in particular that if $C$ is chosen as the mean value of $x(t)$ in $(a, b)$,
the first factor on the right-hand side of (3.10) reduces to the standard deviation.

4. Applications to distributions. Let us consider the inverse function (cf. [3]
pp. 152-3, [5] p. 189) $x(F)$ of the cumulative distribution function $F(x)$. Then
for instance $x(\alpha)$ may be called the $\alpha$-quantile. $x(F)$ is a nondecreasing function
of $F$ in $(0, 1)$. We assume that it belongs to $L_2(0, 1)$.

EXAMPLE 1. The distance between the $\alpha$-quantile and the mean $EX$ is given by

(4.1)    $x(\alpha) - EX = \int_0^1 x(F) \, d\Phi(F),$

where

(4.2)    $\Phi(F) = \begin{cases} -F & \text{when } 0 \le F < \alpha, \\ 1 - F & \text{when } \alpha < F \le 1. \end{cases}$

Theorem 2 is applicable here, leading us to

(4.3)    $-\sigma \sqrt{\frac{1 - \alpha}{\alpha}} \le x(\alpha) - EX \le \sigma \sqrt{\frac{\alpha}{1 - \alpha}}.$

We note that (4.3) has been derived and used by Hoeffding [6]. It is equivalent
to the inequality (cf. [7])

(4.4)    $\Pr \{X \le EX + b\sigma\} \ge \frac{b^2}{1 + b^2} \qquad (b > 0).$

In the special case where $\alpha = \frac{1}{2}$, (4.3) tells us that the distance between the
median and the mean can never exceed the standard deviation (cf. [8]).
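The quantile bounds of Example 1 can be tried on a small empirical distribution. The sketch below assumes the reading $-\sigma\sqrt{(1-\alpha)/\alpha} \le x(\alpha) - EX \le \sigma\sqrt{\alpha/(1-\alpha)}$ for (4.3), takes the $\alpha$-quantile as the smallest sample value whose empirical distribution function is at least $\alpha$, and uses the population standard deviation of the sample; the function names and the data are hypothetical.

```python
from math import ceil, sqrt
from statistics import fmean, pstdev


def quantile(sorted_sample, alpha):
    """Smallest sample value x whose empirical F(x) is >= alpha, for 0 < alpha < 1."""
    n = len(sorted_sample)
    return sorted_sample[max(ceil(alpha * n) - 1, 0)]


def quantile_bounds(sample, alpha):
    """Return (lower bound, alpha-quantile, upper bound) in the sense of (4.3)."""
    mean, sigma = fmean(sample), pstdev(sample)
    lower = mean - sigma * sqrt((1 - alpha) / alpha)
    upper = mean + sigma * sqrt(alpha / (1 - alpha))
    return lower, quantile(sorted(sample), alpha), upper


skewed = [0, 0, 0, 0, 10]
for alpha in (0.1, 0.25, 0.5, 0.75, 0.9):
    print(alpha, quantile_bounds(skewed, alpha))
```

With $\alpha = 1/2$ both bounds reduce to the statement that the median lies within one standard deviation of the mean.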

EXAMPLE 2. The distance between the expected value of the i-th smallest 
member in the sample and the population mean is given by 


1 

(4.5) EX; — EX = / 2(F)¢(F) dF, 
0 

where 


> a n! i-1 a i 
(4.6) come cet 87 ot 


Here again Theorem 2 is applicable. If 7 equals neither 1 nor n, then the func- 
tion @(F) turns out to be 


o(F) = o(F) when 0 S F < Fy, 
= 9(F2) when F; < F <1, 


(4.7) 


where F, is determined by 


1 > 
(4.8) (1 — Fie(F:) = / oAP)dF, 0<F <F, =i. 
Fe n-—1l 


The integral on the right-hand side can be evaluated by means of the incomplete
Beta function. If $n$ is not very large, the tables of the incomplete Beta function
[9] will be satisfactory. Then we can compute the least upper bound of (4.5) for a
given population standard deviation. The bound is actually attained for a particular
distribution $x(F) = A\bar\varphi(F)$. This is a mixed distribution with concentrated
probability on the value $x = A\varphi(F_2)$ and distributed probability on the
interval $(-A, A\varphi(F_2))$.

For $i = n$ (i.e. for the largest member), $\varphi(F)$ is monotone increasing.
Consequently, the ordinary Schwarz inequality will suffice to get the least upper
bound $(n - 1)\sigma/\sqrt{2n - 1}$. (This is naturally a larger bound than the
formula (3.6) of the previous paper [1], which is applicable only to symmetrical
populations.) The bound is actually attained for a particular distribution
$x(F) = A\varphi(F)$.

A few numerical results for the sample median are shown in Table 1. Also
shown are the results given by the ordinary Schwarz inequality. This illustrates the
improvement obtained by the present method.
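
The computation behind such a table can be sketched as follows. This is only an illustration under stated assumptions, not the author's procedure: it assumes that (4.6) is $\varphi(F) = \frac{n!}{(i-1)!(n-i)!}F^{i-1}(1-F)^{n-i} - 1$, that $F_2$ is the root of (4.8) lying between 0 and $(i-1)/(n-1)$, and that the least upper bound (in units of $\sigma$) is $\{\int_0^1 \bar\varphi(F)^2\,dF\}^{1/2}$. Bisection and Simpson's rule stand in for the tables of the incomplete Beta function; all function names are ours.

```python
import math

def phi(F, n, i):
    """phi(F) as in (4.6): density of the i-th order statistic of n uniforms, minus 1."""
    c = math.factorial(n) / (math.factorial(i - 1) * math.factorial(n - i))
    return c * F ** (i - 1) * (1 - F) ** (n - i) - 1.0

def tail_integral(t, n, i):
    """Integral of phi over (t, 1), using the binomial form of the incomplete Beta function."""
    below = sum(math.comb(n, k) * t ** k * (1 - t) ** (n - k) for k in range(i))
    return below - (1.0 - t)

def solve_F2(n, i, iterations=80):
    """Bisection for the root of (1 - F2) * phi(F2) = integral of phi over (F2, 1)."""
    lo, hi = 0.0, (i - 1) / (n - 1)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if (1 - mid) * phi(mid, n, i) - tail_integral(mid, n, i) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def simpson(f, a, b, panels=2000):
    """Composite Simpson rule with an even number of panels."""
    h = (b - a) / panels
    total = f(a) + f(b)
    for j in range(1, panels):
        total += (4 if j % 2 else 2) * f(a + j * h)
    return total * h / 3

def least_upper_bound(n, i):
    """sqrt of the integral of phi_bar^2, with phi_bar = phi on [0, F2) and phi(F2) on [F2, 1]."""
    F2 = solve_F2(n, i)
    head = simpson(lambda F: phi(F, n, i) ** 2, 0.0, F2)
    return math.sqrt(head + (1 - F2) * phi(F2, n, i) ** 2)

def schwarz_bound(n, i):
    """sqrt of the integral of phi^2 over (0, 1), in closed form."""
    c = math.factorial(n) / (math.factorial(i - 1) * math.factorial(n - i))
    beta = math.factorial(2 * i - 2) * math.factorial(2 * n - 2 * i) / math.factorial(2 * n - 1)
    return math.sqrt(c * c * beta - 1.0)

for n in (3, 5, 7):
    i = (n + 1) // 2  # sample median
    print(n, round(least_upper_bound(n, i), 5), round(schwarz_bound(n, i), 5))
```

For $n = 3$ the root is $F_2 = 1/4$ and the two values are about 0.27099 and 0.44721.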

EXAMPLE 3. For $x(1 - \beta) - x(\alpha)$ we get

(4.9)    ... (F) ...


and finally 


(4.10)    $x(1 - \beta) - x(\alpha) \le \sigma\sqrt{\frac{1}{\alpha} + \frac{1}{\beta}}.$


We note that by putting $\alpha = \beta$ in (4.10), we obtain
$x(1 - \alpha) - x(\alpha) \le \sigma\sqrt{2/\alpha}$. This is, for a symmetrical
distribution, equivalent to the Tchebycheff-Bienaymé inequality. The latter, in the
general case, can be derived by applying the above formula after symmetrizing the
given distribution.


TABLE 1

Upper bounds for the difference between the expected value of the sample median
and the population mean, measured by the population standard deviation:
$\{EX_{(n+1)/2} - EX\}/\sigma$

Sample size n    Least upper bound    Upper bound given by ordinary Schwarz inequality
      3               .27099               .44721
      5               .37659               .65465
      7               .43918               .79480
      9               .48291               .90226
     11               .51604               .99019
     13               .54221              1.0650
     15               .56374              1.1305
     17               .58210              1.1888
     19               .59776              1.2416


EXAMPLE 4. For the expected value of the difference of two order statistics,
(3.10) leads us to

(4.11)    $E(X_i - X_j) \le \sigma\left\{\int_0^1 \bar\varphi(F)^2\, dF\right\}^{1/2},$

where $\bar\varphi(F)$ is to be obtained from

(4.12)    $\varphi(F) = \frac{n!}{(i-1)!\,(n-i)!} F^{i-1}(1 - F)^{n-i} - \frac{n!}{(j-1)!\,(n-j)!} F^{j-1}(1 - F)^{n-j}.$

Thus, if $i = n$, $j = n - 1$, then

(4.13)    $\bar\varphi(F) = \varphi(F_2)$ when $0 \le F < F_2$,
          $\bar\varphi(F) = \varphi(F)$ when $F_2 \le F \le 1$,

where $F_2 = (n - 2)/(n - 1)$.
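If $j = n - i + 1$, then

(4.14)    $\bar\varphi(F) = -\varphi(F_2)$ when $0 \le F < 1 - F_2$,
          $\bar\varphi(F) = \varphi(F)$ when $1 - F_2 \le F < F_2$,
          $\bar\varphi(F) = \varphi(F_2)$ when $F_2 \le F \le 1$,

where $F_2$ is determined by

(4.15)    $(1 - F_2)\varphi(F_2) = \int_{F_2}^1 \varphi(F)\, dF, \qquad \tfrac{1}{2} < F_2 < 1.$
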


The case where $i = n$, $j = 1$ (i.e. the case of the "sample range") is an exception,
but it has essentially been discussed already (cf. [10] and [1]; $E(X_n - X_1)/\sigma$
is equal to twice the corresponding value in Table 1 of [1]).


A SIMPLE METHOD FOR IMPROVING SOME ESTIMATORS


By Leo A. Goodman¹
University of Chicago

1. Introduction and Summary. In the past, the principles which have been
applied most often in the selection of an estimate are the principles of maximum
likelihood and of minimum variance unbiased estimation. Recent statistical
literature (e.g., [1]) has pointed out the fact that, while these principles are
intuitively appealing, neither of them can be justified very well in a systematic
development of statistics. Abraham Wald [2] has indicated a more systematic
approach to the problem. One of Wald's ideas may be paraphrased as follows.
Consider a random variable $X$ whose distribution depends on an unknown real
parameter $\theta$. If the value $x$ of $X$ is observed one makes an estimate, say $f(x)$,
and thereby incurs a loss of $W[\theta, f(x)]$. The risk associated with the estimate $f$ is
defined to be the expected loss $R_f(\theta) = E\{W[\theta, f(X)] \mid \theta\}$. In choosing
between two estimators $f_1$ and $f_2$, it seems clear that one would prefer $f_1$ to $f_2$ if
$R_{f_1}(\theta) \le R_{f_2}(\theta)$ for all values of $\theta$, and
$R_{f_1}(\theta) < R_{f_2}(\theta)$ for at least one value of $\theta$.

We shall consider only the case where the loss as a function of $\theta$ and the
estimate $f(x)$ is of the special form

(1)    $W[\theta, f(x)] = \lambda(\theta)(f - \theta)^2; \qquad \lambda(\theta) > 0.$

Reasons for considering this form of $W[\theta, f(x)]$ have been given in [3]. Suppose
that we know of an unbiased estimate $f$ whose variance is $K\theta^2$, where $K$ is known.
Then, as we shall see, the risk of $f$ is greater than the risk of $f/(K + 1)$. Hence
$f/(K + 1)$ is to be preferred to $f$ as an estimator of $\theta$. This result holds for any
function $\lambda(\theta) > 0$. Although $f/(K + 1)$ is generally not unbiased in the usual
sense, it is unbiased in a certain sense (cf. [4]).

It is seen that a special case of this result is related to the problem of the 
estimation of the scale parameter of a population whose form is not given but 
for which the ratio of the first and second moments is known (cf. [5]). 

Some special cases and applications are discussed in detail. 


2. Results. Let $Y$ be any real-valued random variable whose distribution
depends on an unknown real parameter $\theta > 0$.

THEOREM 1. Suppose $\theta E\{Y \mid \theta\}/E\{Y^2 \mid \theta\} = A$ identically in $\theta$,
where $A$ is known. Then among all statistics of the form $aY$, where $a$ is a constant,
the risk

    $R_{aY}(\theta) = E\{\lambda(\theta)(aY - \theta)^2 \mid \theta\}$

is minimized for each value of $\theta$ when $a = A$.


Received 9/29/52.
¹ This report was prepared in connection with research supported by the Office of Naval

PROOF. We have that

    $R_{aY}(\theta) = E\{\lambda(\theta)[a^2 Y^2 + \theta^2 - 2aY\theta] \mid \theta\}$
    $\qquad\quad\;\; = \lambda(\theta)[a^2 E\{Y^2 \mid \theta\} + \theta^2 - 2aA\,E\{Y^2 \mid \theta\}].$

The risk is a quadratic function of $a$ which is minimized when

    $\frac{\partial R_{aY}(\theta)}{\partial a} = 2\lambda(\theta)[a - A]\,E\{Y^2 \mid \theta\} = 0;$

that is, when $a = A$. Q.E.D.

In this case,

    $R_{AY}(\theta) = \lambda(\theta)[\theta^2 - A^2 E\{Y^2 \mid \theta\}].$

For the function which uniformly minimizes the expected loss, we have that

    $E\{AY \mid \theta\} = \theta\,[E\{Y \mid \theta\}]^2/E\{Y^2 \mid \theta\} \le \theta.$

Hence, this function will be unbiased only when
$E\{Y^2 \mid \theta\} = [E\{Y \mid \theta\}]^2$; that is, when the variance of $Y$ is zero.

Following Lehmann [4], we say that an estimate $Y$ is unbiased with respect
to the loss function $W[\theta, Y] = \lambda(\theta)(Y - \theta)^2$ if for each $\theta$,
$E\{W[\theta', Y] \mid \theta\}$, regarded as a function of $\theta'$, is minimized when
$\theta' = \theta$.

THEOREM 2. Suppose $\theta E\{Y \mid \theta\}/E\{Y^2 \mid \theta\} = A$ identically in $\theta$,
where $A$ is known. Then among all statistics of the form $aY$, where $a$ is a constant,
the only one which is unbiased with respect to the loss function $W[\theta, Y]$, when
$\lambda(\theta) = \theta^{-2}$, is $AY$ (which uniformly minimizes the risk).

PROOF. The expected loss $E\{W[\theta', aY] \mid \theta\}$ is a quadratic function of
$1/\theta'$, and the minimum of this quadratic function may be computed as in the proof
of Theorem 1. We see that this function is minimized when $\theta' = \theta$ if and only
if $a = A$. Q.E.D.

If $Y$ is an unbiased estimate (in the usual sense) of $\theta$, then
$A = \theta^2/E\{Y^2 \mid \theta\}$. The relative improvement in risk obtained by using
$AY$ is

    $1 - R_{AY}(\theta)/R_Y(\theta) = 1 + (1 - A)/[1 - 1/A] = 1 - A.$

Since the variance of $Y$ is $[(1/A) - 1]\theta^2 = K\theta^2$ we may write $AY$ as
$Y/[K + 1]$, and the relative improvement in risk is $AK$. We have found that
$Y/[K + 1]$ is unbiased with respect to the loss function (1) when
$\lambda(\theta) = \theta^{-2}$.
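
The quadratic-risk argument of Theorem 1 can be checked numerically. The sketch below is an illustration only; the function names are ours, and the example distribution (an exponential variable with mean $\theta$, so that $E\{Y\} = \theta$, $E\{Y^2\} = 2\theta^2$ and $K = 1$) is our choice rather than one taken from the paper at this point.

```python
def optimal_multiplier(theta, m1, m2):
    """A = theta * E{Y} / E{Y^2}, the multiplier of Theorem 1."""
    return theta * m1 / m2

def risk(a, theta, m1, m2, lam=1.0):
    """Risk of aY under the loss lam * (aY - theta)^2, from the first two moments of Y."""
    return lam * (a * a * m2 - 2.0 * a * theta * m1 + theta ** 2)

theta = 3.0
m1, m2 = theta, 2.0 * theta ** 2          # exponential with mean theta
A = optimal_multiplier(theta, m1, m2)      # equals 1 / (K + 1) with K = 1
improvement = 1.0 - risk(A, theta, m1, m2) / risk(1.0, theta, m1, m2)
print(A, risk(1.0, theta, m1, m2), risk(A, theta, m1, m2), improvement)
```

Here the unbiased estimate $Y$ has risk $\theta^2$, while $Y/2$ has risk $\theta^2/2$, so the relative improvement is $1 - A = AK = 1/2$.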

Let us now consider the special case where $Y$ is a real-valued random variable
whose distribution has the invariance property under a change of scale; that is,
the probability function of $Y$ is $\theta^{-1} f(y/\theta)$, $\theta > 0$, where the
function $f(y)$ is known, but the parameter $\theta$ which determines the scale of the
distribution of $Y$ is unknown. Then $E\{Y \mid \theta\} = M\theta$ and
$E\{Y^2 \mid \theta\} = N\theta^2$, where $M = E\{Y \mid 1\}$ and $N = E\{Y^2 \mid 1\}$.
Since the conditions of Theorems 1 and 2 hold, we see that among all invariant
functions $g(Y)$ (i.e., functions having the invariance property $g(cY) = cg(Y)$ for
all $c > 0$) there is one which uniformly minimizes the expected loss. This function
is $Y E\{Y \mid 1\}/E\{Y^2 \mid 1\} = D$. This fact may also be proved using a result due
to Pitman ([5], p. 406) which deals with the case where one has a sample from the
distribution $\theta^{-1} f(y/\theta)$. Pitman showed that the invariant estimator with
the smallest mean square error is


    $C = \int_0^\infty \theta^{-3} f(Y/\theta)\, d\theta \Big/ \int_0^\infty \theta^{-4} f(Y/\theta)\, d\theta.$


By a change of variables we see that $C = D$. Hence the estimate $C$ may be
computed even if the function $f(y)$ is not given but the ratio of the first and second
moments is known. In fact, even if the probability distribution of $Y/\theta$ is a
function of $\theta$, we see that $C$ is the invariant estimate of $\theta$ with the
smallest mean-squared error when the ratio of the mean and second moment of
$Y/\theta$ is known. We also have that $C$ is unbiased with respect to the loss
function (1) when $\lambda(\theta) = \theta^{-2}$.


3. Applications. Suppose $x_1, x_2, \ldots, x_n$ is a sample of $n$ from a normal
distribution where both the mean $\mu$ and variance $\sigma^2$ are unknown. Writing
$\bar{x} = \sum_{i=1}^n x_i/n$ and $s^2 = \sum_{i=1}^n (x_i - \bar{x})^2/(n - 1)$, we find
that $s^2(n - 1)/(n + 1)$ is the invariant estimate of $\sigma^2$ which minimizes the
risk. Also,

    $s\,[2/(n - 1)]^{1/2}\,\Gamma(n/2)/\Gamma((n - 1)/2)$

is the invariant estimate of $\sigma$ which minimizes the risk. When $\mu$ is known,
writing $t^2 = \sum_{i=1}^n (x_i - \mu)^2/n$, we find that $t^2 n/(n + 2)$ is the invariant
estimate of $\sigma^2$ which minimizes the risk. Hodges and Lehmann ([1], p. 17) have
shown that $t^2 n/(n + 2)$ is the unique admissible minimax estimate of $\sigma^2$ when
$\lambda(\theta) = 1/\theta^2$. They point out the fact that $t^2$ is neither minimax nor
admissible. We also find that $t\,[2/n]^{1/2}\,\Gamma((n + 1)/2)/\Gamma(n/2)$ is the
invariant estimate of $\sigma$ which minimizes the risk.

Suppose $x_1, x_2, \ldots, x_n$ is a sample of $n$ from a uniform distribution on
$(0, \rho)$, where $\rho$ is unknown. Writing $y = \max(x_1, x_2, \ldots, x_n)$, we see that
$y(n + 2)/(n + 1)$ is the invariant estimate of $\rho$ which minimizes the risk
(cf. [4], p. 589).
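
The multipliers quoted in these examples all have the form $A = E\{Y \mid 1\}/E\{Y^2 \mid 1\}$. The sketch below recomputes three of them in exact arithmetic from standard moment formulas, which are supplied here as assumptions: for the normal case $E\{s^4\}/\sigma^4 = (n+1)/(n-1)$ and $E\{t^4\}/\sigma^4 = (n+2)/n$, and for the uniform maximum $E\{y\}/\rho = n/(n+1)$ and $E\{y^2\}/\rho^2 = n/(n+2)$. The function names are ours.

```python
from fractions import Fraction

def multiplier(m1, m2):
    """A = E{Y | theta = 1} / E{Y^2 | theta = 1}."""
    return Fraction(m1) / Fraction(m2)

def normal_s2(n):
    """Y = s^2 (mean unknown): E{Y} = 1 and E{Y^2} = (n + 1)/(n - 1) when sigma^2 = 1."""
    return multiplier(1, Fraction(n + 1, n - 1))

def normal_t2(n):
    """Y = t^2 (mean known): E{Y} = 1 and E{Y^2} = (n + 2)/n when sigma^2 = 1."""
    return multiplier(1, Fraction(n + 2, n))

def uniform_max(n):
    """Y = max of n uniforms on (0, 1): E{Y} = n/(n + 1), E{Y^2} = n/(n + 2)."""
    return multiplier(Fraction(n, n + 1), Fraction(n, n + 2))

n = 10
print(normal_s2(n), normal_t2(n), uniform_max(n))
```

The three results agree with $(n-1)/(n+1)$, $n/(n+2)$ and $(n+2)/(n+1)$ respectively.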

The preceding examples deal with random variables whose distributions have
the invariance property under a change of scale. The conditions
$E\{Y \mid \theta\} = M\theta$ and $E\{Y^2 \mid \theta\} = N\theta^2$ are weaker than the
condition of invariance under change of scale. Hence, Theorems 1 and 2 are stronger
than the corresponding invariance theorems. The following example satisfies the
conditions of our theorems ("second order invariance"), but the distribution of the
random variable does not have the invariance property (in the usual sense) under
change of scale.

The distribution of the random variable $Y$ depends on an unknown real
parameter $\theta$ which may be included in one of two disjoint sets $\Omega_1$ or
$\Omega_2$. If $\theta$ is in $\Omega_1$, then $Y/\theta$ has a Poisson distribution with a
mean of 1 (variance also is 1). If $\theta$ is in $\Omega_2$, then $Y/\theta$ has an
exponential distribution with a mean of 1 (variance also is 1). Whether $\theta$ is
included in $\Omega_1$ or in $\Omega_2$ is unknown, so that the distribution of $Y/\theta$ is
a function of $\theta$ and does not have the invariance property under change of scale.
We see that $Y/2$ is the invariant estimate of $\theta$ which minimizes the risk and that
the risk associated with the unbiased estimator is twice the risk associated with
$Y/2$.


Using Theorem 2, we find that all the invariant estimates described in this
section which minimize the risk are also unbiased with respect to the loss function
(1) when $\lambda(\theta) = \theta^{-2}$. We also see that these invariant estimates $C$
have the optimum properties of minimizing the risk and being unbiased with respect
to the loss function (1) with $\lambda(\theta) = \theta^{-2}$ even when the underlying
distribution of the variates is not the one specified (normal, uniform, etc.) as long as

    $E\{C/\theta \mid \theta\} = E\{C^2/\theta^2 \mid \theta\}.$


JOINT SAMPLING DISTRIBUTION OF THE MEAN AND
STANDARD DEVIATION FOR PROBABILITY DENSITY
FUNCTIONS OF DOUBLY INFINITE RANGE¹


By Melvin D. Springer


U.S. Naval Ordnance, Indianapolis


1. Summary. The joint sampling distribution of $\bar{x}$ and $S$ is derived in
integral form for probability density functions of doubly infinite range. This
derivation is effected through the use of a transformation which transforms the
sample probability element $f(x_1)f(x_2) \cdots f(x_n)\, dx_1\, dx_2 \cdots dx_n$ into the
element

    $f(x_1)f(x_2) \cdots f(x_{n-2})\, f\big((n\bar{x} - \sum_1^{n-2} x_i \pm \Omega_1)/2\big)\, f\big((n\bar{x} - \sum_1^{n-2} x_i \mp \Omega_1)/2\big)\, |J|\, dx_1\, dx_2 \cdots dx_{n-2}\, d\bar{x}\, dS,$

where $\bar{x} = (1/n)\sum_1^n x_i$, $S^2 = (1/n)\sum_1^n (x_i - \bar{x})^2$, and $J$ is the
Jacobian of the transformation. Bounds on $x_{n-r}$, $r = 2, 3, \ldots, n - 1$, are
established in terms of $\bar{x}$, $S$, and $x_{n-r-j}$, $j = 1, 2, \ldots, n - r - 1$. The
probability element displayed above must then be integrated with respect to
$x_{n-r}$, $r = 2, 3, \ldots, n - 1$, between these limits to obtain
$F(\bar{x}, S)\, d\bar{x}\, dS$, the joint probability element of $\bar{x}$ and $S$. These
limits of integration of $x_{n-r}$, $r = 2, 3, \ldots, n - 1$, enable one to express
$F(\bar{x}, S)$ in terms of quadratures when $f(x)$ is any probability density function
of doubly infinite range. To illustrate the method, $F(\bar{x}, S)$ is obtained when
$f(x)$ is the normal probability density function.


2. Introduction. It is well known that if random samples of $n$ items are drawn
from a parent population, $\bar{x}$ and $S$ will be independent in the probability sense
if and only if $x$ is normally distributed in this population [1]. Furthermore, if
the parent population is normal with mean $m$ and standard deviation $\sigma$,
$\bar{x}$ and $S$ are distributed jointly in accordance with

(1)    $F(\bar{x}, S) = \frac{n^{n/2} S^{n-2} \exp\{-(n/2\sigma^2)[(\bar{x} - m)^2 + S^2]\}}{2^{(n-2)/2}\, \sigma^n\, \pi^{1/2}\, \Gamma\!\left(\frac{n-1}{2}\right)}.$

Received 6/2/51, revised 8/28/52.

¹ This paper was presented at a joint meeting of the Institute of Mathematical Statistics
and the Biometric Society at Oak Ridge, Tennessee on March 17, 1951. The opinions
expressed herein are solely those of the author and are not necessarily those of the U.S.
Navy Department.

This joint distribution for the normal function is often referred to as Helmert's
distribution, since it was first established by Helmert [2]. Helmert arrived at
(1) through the use of a pair of linear transformations which transformed the
joint distribution of the individual errors of observation into a joint distribution
of sample mean error and standard deviation, plus dummy variables which
were integrated out over all possible values. Kruskal [3] has shown that Helmert's
distribution may be obtained directly by mathematical induction. However, when
sampling is extended to nonnormal universes, little seems to be known about
$F(\bar{x}, S)$ except for very small samples. Truksa [4] has expressed
$F(\bar{x}_t, S_t)$ in integral form and has applied "the concept of the probability of
passage" to obtain $F(\bar{x}_{t+2}, S_{t+2})$ from $F(\bar{x}_t, S_t)$, where $\bar{x}_t$ and
$S_t$ represent, respectively, the mean and standard deviation of a sample of $t$ items
and where $F(\bar{x}_t, S_t)$ is assumed to be known. A. T. Craig [5] has derived
$F(\bar{x}, S)$ in integral form when $n = 2, 3, 4$ for probability density functions of
doubly infinite, singly infinite, and finite positive range. Yet, for no probability
density function $f(x)$ has $F(\bar{x}, S)$ ever been expressed explicitly in terms of
quadratures for the general case of samples of size $n$. It is the purpose of this
paper to derive $F(\bar{x}, S)$ in terms of quadratures for any sample size when $f(x)$
is any probability density function of doubly infinite range. Whereas the procedure
in [3] and [4] is to add one or two new observations and express the new $\bar{x}$ and
$S$ in terms of the old, whose distribution is taken as known, I shall employ
a transformation for fixed $n$ and derive an integration formula, particularly the
limits of integration, inductively.


3. $F(\bar{x}, S)$ for probability density functions of doubly infinite range.
Consider a universe characterized by the probability density function $f(x)$,
$-\infty < x < \infty$. If $n$ variates $x_i$, $i = 1, 2, \ldots, n$, are selected at random
from this universe, the probability that they will fall simultaneously within the
intervals $dx_i$, $i = 1, 2, \ldots, n$, is given, to within infinitesimals of higher order, by

    $f(x_1)f(x_2) \cdots f(x_n)\, dx_1\, dx_2 \cdots dx_n.$

Since $nS^2 = \sum_1^n x_i^2 - n\bar{x}^2$ and $n\bar{x} = \sum_1^n x_i$, we may eliminate $x_n$
in the first equation, obtaining

(2)    $\sum_1^{n-1} x_i^2 + u_{n-1}^2 - 2n\bar{x}u_{n-1} + n(n - 1)\bar{x}^2 - nS^2 = 0,$

where $u_k = \sum_1^k x_i$. Solving this (symmetric) equation for $x_{n-1}$ we have

(3)    $x_{n-1} = \tfrac{1}{2}(n\bar{x} - u_{n-2} \pm \Omega_1),$

where

(4)    $\Omega_r^2 = -r(r + 1)\sum_1^{n-r-1} x_i^2 - r u_{n-r-1}^2 + 2rn\bar{x}u_{n-r-1} - rn(n - r - 1)\bar{x}^2 + (r + 1)rnS^2.$

Thus we may employ the transformation $T$:

    $x_i = x_i, \qquad i = 1, 2, \ldots, n - 2,$
    $x_{n-1} = \tfrac{1}{2}(n\bar{x} - u_{n-2} \pm \Omega_1),$
    $x_n = \tfrac{1}{2}(n\bar{x} - u_{n-2} \mp \Omega_1).$

Application of this transformation to $f(x_1)f(x_2) \cdots f(x_n)\, dx_1\, dx_2 \cdots dx_n$ gives

(5)    $f(x_1)f(x_2) \cdots f(x_n)\, dx_1\, dx_2 \cdots dx_n$
       $= f(x_1)f(x_2) \cdots f(x_{n-2})\, f(\tfrac{1}{2}(n\bar{x} - u_{n-2} \pm \Omega_1))\, f(\tfrac{1}{2}(n\bar{x} - u_{n-2} \mp \Omega_1))\, |J|\, dx_1\, dx_2 \cdots dx_{n-2}\, d\bar{x}\, dS,$

where $|J| = |\text{Jacobian of } T| = n^2 S/\Omega_1$. Evaluation of the multiple integral

(6)    $\int\!\!\int \cdots \int f(x_1)f(x_2) \cdots f(x_{n-2})\, f(\tfrac{1}{2}(n\bar{x} - u_{n-2} - \Omega_1))\, f(\tfrac{1}{2}(n\bar{x} - u_{n-2} + \Omega_1))\, \frac{2n^2 S}{\Omega_1}\, dx_{n-2}\, dx_{n-3} \cdots dx_1$

over the range of the variables $x_i$, $i = 1, 2, \ldots, n - 2$, yields the joint
distribution $F(\bar{x}, S)$. It will be shown presently that the limits of integration of
$x_{n-r}$ in (6) are $(n\bar{x} - u_{n-r-1} \pm \Omega_r)/(r + 1)$, $r = 2, 3, \ldots, n - 1$. Before
establishing these limits, let us consider an example.


4. The normal distribution. To illustrate the method, we shall derive $F(\bar{x}, S)$
for the normal distribution with mean $m$ and standard deviation $\sigma$. This entails
evaluating (6) for $f(x) = (1/\sigma(2\pi)^{1/2}) \exp\{-\tfrac{1}{2}(x - m)^2/\sigma^2\}$, the
limits of integration of $x_{n-r}$, $r = 2, 3, \ldots, n - 1$, having been specified at the
close of Section 3. Upon employing the relationship (which is easily verified)

    $\Omega_m^2 = \frac{m}{m + 2}\,\Omega_{m+1}^2 - m(m + 2)\left\{x_{n-m-1} + \frac{u_{n-m-2} - n\bar{x}}{m + 2}\right\}^2$

and evaluating a few of the integrals in (6), it becomes evident that after $r$
integrations we have

(7)    $F(\bar{x}, S) = \frac{2n^2 S \exp\{-\tfrac{1}{2}(n/\sigma^2)[(\bar{x} - m)^2 + S^2]\}\, \pi^{(r+1)/2}}{\sigma^n (2\pi)^{n/2} (r + 1)^{(r-1)/2} (r + 2)^{r/2}\, \Gamma\!\left(\frac{r+1}{2}\right)} \int\!\!\int \cdots \int \Omega_{r+1}^{r-1}\, dx_{n-r-2} \cdots dx_2\, dx_1,$

the limits of integration having already been stated. To establish (7) by
mathematical induction, assume that (7) results after $r$ integrations are performed in
(6), where $r$ is any integer from 1 through $n - 3$. Carrying out the next integration,
we obtain in a very straightforward manner (7) with $r$ replaced by $r + 1$.
Thus, if (7) holds for $r$ integrations, it necessarily holds for $r + 1$ integrations.
But it is easily verified that (7) holds for $r = 1$; therefore, it holds when
$r = 2, 3, \ldots, n - 2$. Letting $r = n - 2$ in (7), we have the well known joint
sampling distribution of $\bar{x}$ and $S$ for the normal universe, namely, (1).


5. Limits of integration of the variables. It remains to prove that the variable
$x_{n-r}$ is restricted to the closed interval

(9)    $\left(\frac{n\bar{x} - u_{n-r-1} - \Omega_r}{r + 1},\ \frac{n\bar{x} - u_{n-r-1} + \Omega_r}{r + 1}\right), \qquad r = 2, 3, \ldots, n - 1.$

To accomplish this, we again resort to mathematical induction. To expedite
matters further, let us agree that when $\Omega_m^2$, $m = 1, 2, \ldots, n - 2$, is involved
in this discussion, it shall be regarded as a quadratic function of $x_{n-m-1}$. Bearing
this in mind, we note that the discriminant of $\Omega_m^2$ is a positive multiple of
$\Omega_{m+1}^2$. We note further that since $x_{n-1}$ is necessarily real, the inequality

(10)    $\Omega_1^2 \ge 0$

must be satisfied. Clearly, a necessary and sufficient condition that (10) can be
satisfied is that the discriminant of $\Omega_1^2$ be nonnegative. That is, for a given
$\bar{x}$ and $S$, the $x_i$, $i = 1, 2, \ldots, n - 3$, must collectively satisfy the condition

(11)    $\Omega_2^2 \ge 0,$

in which case condition (10) will be fulfilled if and only if

    $\frac{n\bar{x} - u_{n-3} - \Omega_2}{3} \le x_{n-2} \le \frac{n\bar{x} - u_{n-3} + \Omega_2}{3}.$

Similarly, condition (11) can be met if and only if $\Omega_3^2 \ge 0$. This restricts
$x_i$, $i = 1, 2, \ldots, n - 4$, to values which collectively satisfy $\Omega_3^2 \ge 0$, in
which case

    $\frac{n\bar{x} - u_{n-4} - \Omega_3}{4} \le x_{n-3} \le \frac{n\bar{x} - u_{n-4} + \Omega_3}{4}.$

In general, since the discriminant of $\Omega_{r-1}^2$ is a positive multiple of
$\Omega_r^2$, if $x_1, x_2, \ldots, x_{n-r}$ satisfy the condition

(12)    $\Omega_{r-1}^2 \ge 0, \qquad r = 2, 3, \ldots, n - 1,$

then $x_1, x_2, \ldots, x_{n-r-1}$ must necessarily fulfill, collectively, the condition
$\Omega_r^2 \ge 0$, whence

    $\frac{n\bar{x} - u_{n-r-1} - \Omega_r}{r + 1} \le x_{n-r} \le \frac{n\bar{x} - u_{n-r-1} + \Omega_r}{r + 1}.$

That is, if condition (12) obtains when $r = p$, it necessarily holds when
$r = p + 1$. But condition (12) must hold when $r = 2$; therefore, it must hold when
$r = 3, 4, \ldots, n - 1$. This confines $x_{n-r}$, $r = 2, 3, \ldots, n - 1$, to the closed
interval (9). Finally, since $\Omega_{n-1}^2 = n^2(n - 1)S^2 \ge 0$, all the intervals (9)
exist in the real domain.
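
The limits can be checked numerically on a concrete sample. The sketch below is an illustration under an assumption: it takes $\Omega_r^2 = -r(r+1)\sum_1^{n-r-1} x_i^2 - r u_{n-r-1}^2 + 2rn\bar{x}u_{n-r-1} - rn(n-r-1)\bar{x}^2 + (r+1)rnS^2$, which is the form obtained by requiring that the remaining $r + 1$ observations have the prescribed sum and sum of squares. The function names and the sample values are ours.

```python
import math

def omega_sq(x, r):
    """Omega_r^2 for the sample x (x[0] plays the role of x_1), under the form assumed above."""
    n = len(x)
    xbar = sum(x) / n
    S2 = sum((v - xbar) ** 2 for v in x) / n       # S^2 uses the divisor n
    head = x[: n - r - 1]                           # x_1, ..., x_{n-r-1}
    u = sum(head)
    ss = sum(v * v for v in head)
    return (-r * (r + 1) * ss - r * u * u + 2 * r * n * xbar * u
            - r * n * (n - r - 1) * xbar ** 2 + (r + 1) * r * n * S2)

def limits(x, r):
    """Endpoints (n*xbar - u_{n-r-1} -/+ Omega_r) / (r + 1) for x_{n-r}."""
    n = len(x)
    xbar = sum(x) / n
    u = sum(x[: n - r - 1])
    w = math.sqrt(max(omega_sq(x, r), 0.0))
    return (n * xbar - u - w) / (r + 1), (n * xbar - u + w) / (r + 1)

sample = [1.0, 4.0, 2.0, 7.0, 3.0]
for r in range(1, len(sample)):
    lo, hi = limits(sample, r)
    value = sample[len(sample) - r - 1]             # x_{n-r}
    print(r, round(lo, 4), value, round(hi, 4), lo - 1e-9 <= value <= hi + 1e-9)
```

For $r = 1$ the two endpoints are the last two observations themselves, and for $r = n - 1$ the interval is $\bar{x} \pm S\sqrt{n - 1}$.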

Although the joint sampling distribution of $\bar{x}$ and $S$ is given by (6) for any
probability density function of doubly infinite range, it may be necessary to
resort to numerical integration or other approximate methods to evaluate the
multiple integral (6) when $n > 3$.

The distributions of $\bar{x}$ and $S$ taken singly are, of course,

(13)    $g(\bar{x}) = \int_0^\infty F(\bar{x}, S)\, dS,$

and

(14)    $h(S) = \int_{-\infty}^{\infty} F(\bar{x}, S)\, d\bar{x}.$

The question naturally arises as to whether a similar method may not be
used to determine the joint sampling distribution of $\bar{x}$ and $S$ for probability
density functions of singly infinite range. Actually, the procedure here described
may be modified, particularly with respect to the limits of integration of (6),
to obtain $F(\bar{x}, S)$ for probability density functions of singly infinite range. This
modification, necessitated by the restriction of $x_i$, $i = 1, 2, \ldots, n$, to nonnegative
values, considerably complicates the derivation of $F(\bar{x}, S)$. Since the
results are quite lengthy they will not be presented here, but will be discussed
in detail in a later paper.


NOTES

A NOTE ON INCOMPLETE BLOCK DESIGNS WITH ROW BALANCE*


By H. O. Hartley, S. S. Shrikhande, and W. B. Taylor
University College, London, University of Kansas, and University College, London


1. Introduction and summary. With the balanced incomplete block designs 
[1] r replicates of each of v treatments are arranged in b ‘blocks’, and the number 
of ‘plots’, k, in each block is smaller than v. In order to eliminate systematic 
block differences from the comparison of treatment means and to obtain treat- 
ment comparisons of equal precision the well known conditions of balance 
specify 

(a) that no treatment should occur more than once in any block, 

(b) that the number of blocks in which any two particular treatments are both
applied should be a constant number of blocks ($\lambda$ blocks) for all possible
treatment pairs.

When these designs are used in this form, the position of treatments within 
each block is not specified and can normally be regarded as unimportant. Ac- 
cordingly the treatments in each block are randomised. Situations, however, 
arise in which the ‘plots’ occupy certain characteristic positions in each block. 
Thus, if in a field experiment the blocks are vertical columns the plots would 
fall into k horizontal rows which may also have systematic effects on the yields. 
In this case it will often be advantageous to balance the design with regard to 
rows (as well as with regard to columns) in a manner similar to the Latin square. 
Such an arrangement was first developed by Youden [2] who used the particular 
incomplete block designs with b = v and in these rearranged the treatments in 
each block in such a way that every treatment occurred precisely once in each 
row. More recently one of us (S. S. S.) has carried out similar rearrangements for
the other incomplete block designs with $b > v$, $v \le 10$, $r \le 10$ (i.e., for those
tabulated in standard tables and books), and has used a definition of balance
resulting in at most two different precisions for treatment comparisons. In this
note we show that

(i) balancing with regard to rows resulting in an equal precision of all treatment
comparisons is possible if¹ $b = mv$ ($m$ integral),

(ii) in all incomplete block designs with r = mk + 1 a row balance is possible 
resulting in treatment comparisons of two different precisions. 

One of us (W. B. T.) has prepared complete tables of double balanced designs 
suitable for practical use which it is hoped to publish together with the analysis 
of variance procedure with recovery of interblock information. 

* Received 2/19/52.

¹ A theorem bearing on the necessity of this condition is given in Section 4.

2. Notations and preliminaries. We start from a balanced incomplete block
design with parameters $v$, $b$, $r$, $k$ and $\lambda$. It follows from (a) and (b) that each
treatment occurs in just $r$ blocks and that the integers $v$, $b$, $r$, $k$ and $\lambda$ must
satisfy the following conditions.

(2.1)    $vr = bk, \qquad \lambda(v - 1) = r(k - 1), \qquad b \ge v.$


The last inequality, which is not so obvious, is due to Fisher [4]. Let the design
be written in row column form where the columns correspond to blocks and the
rows to the positions of treatments in the blocks. Let $n_{ij}$ be the number of times
treatment $i$ occurs in row $j$ in the $b$ columns ($i = 1, 2, \ldots, v$; $j = 1, 2, \ldots, k$).
It is shown in [3] that if

(2.2)    $\sum_j n_{ij}^2 = \nu,$

(2.3)    $\sum_j n_{ij} n_{uj} = \mu, \qquad i \ne u,$

then all treatment comparisons are made with equal accuracy. If, however, the
design satisfies (2.2) and the condition

(2.4)    $\sum_j n_{ij} n_{uj} = \mu_e, \qquad e = 1, 2,$

where treatments $i$ and $u$ are $e$-associates as defined by Bose and Nair [5], then
it has been shown there that there are two accuracies, that is, some pairs of
treatments are compared with one accuracy while other pairs are compared with
a different accuracy. The row balancing of all the designs obtained in the above
paper [3] satisfies the conditions of partially balanced incomplete block designs
as defined by Bose and Nair as well as that of a particular class of group divisible
designs given by Nair and Rao [6]. In this particular class of incomplete designs
the $v$ treatments can be divided into $c$ groups of size $d$ each so that (2.2) holds
and further

(2.5)    $\sum_j n_{ij} n_{uj} = \mu_1 \text{ or } \mu_2$

according as treatments $i$ and $u$ belong to the same group or to different groups.
It is easy to verify that such a group divisible design is always a partially balanced
incomplete block design.

In the rest of the paper it is proved that if m is a positive integer then a bal- 
anced incomplete block design with b = mv can be converted into a design 
satisfying (2.2) and (2.3) while a design with r = mk + 1 can be converted into 
a design with (2.2) and (2.5). These results follow from a general lemma due to 
Smith and Hartley [7] and which may be stated as follows: 

LEMMA. If we are given any set of $bk$ elements made up of $b$ "treatments" each
replicated k times, the set being arranged in a two-way classification of k rows and 
b columns, then it is always possible to rearrange the elements in each column so that 
each row will contain every treatment once and only once. 
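
One constructive way to realise the rearrangement promised by the lemma is sketched below. It is not claimed to be the argument of [7]: it relies on the standard fact that a regular bipartite multigraph (columns against treatments) contains a perfect matching, extracts one matching per row with the augmenting-path method, and removes it before looking for the next. The function name `row_balance` and the example design (the seven blocks of the $v = b = 7$, $k = 3$ design) are ours.

```python
def row_balance(columns):
    """Rearrange each column so that every row contains each treatment exactly once.

    columns: list of b lists, each holding k treatment labels, where each of the
    b distinct labels occurs k times in total. Returns k rows of length b.
    """
    b, k = len(columns), len(columns[0])
    remaining = [list(col) for col in columns]
    rows = []
    for _ in range(k):
        match = {}  # treatment -> column currently supplying it for this row

        def try_assign(c, seen):
            # Augmenting-path step: find a treatment for column c, re-routing others if needed.
            for t in remaining[c]:
                if t in seen:
                    continue
                seen.add(t)
                if t not in match or try_assign(match[t], seen):
                    match[t] = c
                    return True
            return False

        for c in range(b):
            if not try_assign(c, set()):
                raise ValueError("input does not satisfy the hypotheses of the lemma")
        row = [None] * b
        for t, c in match.items():
            row[c] = t
            remaining[c].remove(t)
        rows.append(row)
    return rows

blocks = [[1, 2, 4], [2, 3, 5], [3, 4, 6], [4, 5, 7], [5, 6, 1], [6, 7, 2], [7, 1, 3]]
for row in row_balance(blocks):
    print(row)
```

Each printed row is a permutation of the seven treatments, while every column still contains the treatments of its original block.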


3. The case $b = mv$. Consider a balanced incomplete block design with parameters
$v$, $b$, $r$, $k$ and $\lambda$, where $b = mv$ and hence $r = mk$. Let it be written in row
column form with $k$ rows and $b$ columns. Since $r = mk$ we can split the $mk$
replications of each treatment into $k$ replications of $m$ pseudo-treatments. We
then have a two-way classification of $k$ rows and $b$ columns in which each of the
$b$ $(= mv)$ pseudo-treatments is replicated $k$ times. Applying the lemma above,
each of the $b$ pseudo-treatments can be made to occur once and only once in
every row. Hence each of the original $v$ treatments can be made to occur $m$ times
in every row. We then get $\sum_j n_{ij}^2 = \sum_j n_{ij} n_{uj} = km^2$ for all $i$ and $u$. Thus
(2.2) and (2.3) are satisfied and, therefore, the converted design can be used for
two-way elimination with the same accuracy for all treatment comparisons.


4. The case $mv < b < (m + 1)v$. In this case it is obviously impossible to
specify that each treatment should occur the same number of times in every
row since this would imply that $r$ is a multiple of $k$ and hence $b$ is the same
multiple of $v$. All one can hope for is that one may rearrange the treatments so
that

CONDITION A. Each treatment occurs either $m$ or $m + 1$ times in every row, that
is, $n_{ij} = m$ or $m + 1$ for all $i$ and $j$.

Condition A implies that after rearrangement each treatment will occur $m + 1$
times in some $r - km$ rows and $m$ times in the remaining ones. Hence condition
(2.2) is satisfied with $\nu = 2rm + r - km^2 - km$.

One may inquire whether it is possible for a balanced incomplete block design
with $mv < b < (m + 1)v$ to satisfy not only A but also condition (2.3), that is,
$\sum_j n_{ij} n_{uj} = \mu$ for all $i \ne u$. If the design could be made to satisfy both these
conditions then it could be used for two-way elimination of heterogeneity with
the same accuracy for all treatment comparisons.

We now show that both these conditions cannot be simultaneously satisfied. 

Suppose there actually exists a balanced incomplete block design with
$mv < b < (m + 1)v$ satisfying both the conditions. Because of A there are in each row
precisely $k' = b - mv$ treatments occurring $m + 1$ times and $v - k'$ treatments
occurring $m$ times in that row. Take one replicate from each of the former only,
leaving out the remaining $mv$ replicates of the $v$ treatments. These form $k$
"reduced" rows each of size $k'$ in which each of the $v$ treatments occurs
$r'$ $(= r - mk)$ times. From $mv < b < (m + 1)v$ it follows that $r \ge km + 1$ or
$r' \ge 1$. Further $vr' = kk'$ and $v > k$ imply that $k' > r'$ so that $k' \ge 2$. The
condition (2.3) then implies that for these reduced rows

    $\sum_{j=1}^{k} n'_{ij} n'_{uj} = \mu' = \mu - km^2 - 2m(r - km), \qquad i \ne u = 1, 2, \ldots, v,$

where $\mu'$, necessarily integral, is not less than 1. Since for these reduced rows
$n'_{ij} = 0$ or 1, it follows that every pair of treatments $i$ and $u$ will occur together
in exactly $\mu'$ of these reduced rows. Hence the array of $k$ reduced rows would
specify a balanced incomplete block design with parameters $v' (= v)$, $b' (= k)$,
$r'$, $k'$ and $\mu'$. But this is impossible from (2.1) since $b' < v'$.

We will now show that it is possible to convert balanced incomplete block 
designs with r = mk + 1 so that after suitable interchanges of treatments in 
various columns the v treatments are divided into c groups of d treatments each 
satisfying condition (2.2) and (2.5). 

5. The case $r = mk + 1$. Since $r = mk + 1$ and $bk = vr$ it follows that
$b = mv + t$ where $t = v/k$ is an integer. We now split the $r$ replications of each
treatment into $m$ pseudo-treatments with $k$ replications, leaving one "odd"
replication. The $v$ "odd" replications can be considered as $t$ further pseudo-treatments
with $k$ replications each. We now apply the lemma to this arrangement of $b$
pseudo-treatments so that each pseudo-treatment occurs just once in every row.
This implies that the original $v$ treatments are divided into $c = k$ groups of $d = t$
each so that in each row all the treatments of only one group occur $m + 1$ times
each, while the remaining ones occur $m$ times. Further, treatments of any group
occur $m + 1$ times each in only one row. It is easily verified that (2.2) and (2.5)
are satisfied with $\nu = \mu_1 = m^2 k + 2m + 1$ and $\mu_2 = m^2 k + 2m$.
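
The constants in this case follow directly from the row counts, as the following sketch shows. It assumes the pattern described above: a treatment occurs $m + 1$ times in the single row belonging to its group and $m$ times in each of the other $k - 1$ rows; passing `odd=-1` gives the analogous pattern with $m - 1$ in place of $m + 1$. The function name and the argument `odd` are ours.

```python
def row_balance_constants(m, k, odd=1):
    """Return (nu, mu1, mu2) for row counts m + odd in one row and m in the other k - 1 rows."""
    same = [m + odd] + [m] * (k - 1)             # counts n_ij of a treatment i
    other = [m, m + odd] + [m] * (k - 2)         # counts of a treatment u in a different group
    nu = sum(a * a for a in same)                # sum over rows of n_ij^2
    mu1 = sum(a * b for a, b in zip(same, same))     # i and u in the same group
    mu2 = sum(a * b for a, b in zip(same, other))    # i and u in different groups
    return nu, mu1, mu2

m, k = 2, 3
print(row_balance_constants(m, k))           # case r = mk + 1
print(row_balance_constants(m, k, odd=-1))   # case r = mk - 1
```

With $m = 2$, $k = 3$ the first call gives $(17, 17, 16)$, that is $m^2 k + 2m + 1$ and $m^2 k + 2m$.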


6. The case $r = mk - 1$, $m \ge 2$. In this case $b = mv - t$ where $t = v/k$ is
an integer. We now split the $r$ replications of each treatment into $m - 1$ $(\ge 1)$
pseudo-treatments with $k$ replications each, leaving $k - 1$ "odd" replications.
Now add $t$ dummy blocks of size $k$ containing each of the $v$ treatments precisely
once. The $v$ treatments may be arranged in any way in these dummy blocks.
The $k - 1$ "odd" replications of any treatment together with the replication of
the same treatment in the dummy blocks can be considered as one more
pseudo-treatment with $k$ replications. Thus in all we have $mv$ pseudo-treatments
occurring $k$ times in the $mv$ blocks. We now apply the lemma to these $mv$ blocks.
After the lemma's rearrangement we consider only the $b$ original blocks, and find
that each of the $v$ original treatments occurs either $m - 1$ or $m$ times in every row.
We now divide the $v$ treatments into $k$ groups of size $t$ by placing those treatments
which occur $m - 1$ times in a particular row in one group. It is easily seen that
(2.2) and (2.5) are then satisfied with $\nu = \mu_1 = km^2 - 2m + 1$ and
$\mu_2 = km^2 - 2m$.

Finally it should be noted that the interchanges discussed in Sections 3-6 do 
not require that every pair of treatments should occur in the same number of 
columns. It is sufficient that every treatment occurs the same number of times 
in the b columns. 


A LOWER BOUND FOR THE AVERAGE SAMPLE NUMBER 
OF A SEQUENTIAL TEST¹


By WASSILY HOEFFDING


University of North Carolina 


Summary. A lower bound is derived for the expected number of observations 
required by an arbitrary sequential test which satisfies conventional conditions 
regarding the probabilities of erroneous decisions. 


1. Statement of results. Let $X_1, X_2, \cdots$ be a sequence of independent
random variables with a common frequency function $f(x, \theta)$ (either a probability
density or the elementary probability law of a discrete distribution), where the
parameter $\theta$ is confined to a set $\Omega$. Let $S$ be an arbitrary (possibly randomized)
sequential test for deciding between two alternatives $H_0$ and $H_1$ which fulfills
the following requirement. Given two disjoint subsets $\omega_0$ and $\omega_1$ of $\Omega$ and two
numbers $\alpha, \beta$ between 0 and 1, $S$ satisfies the inequalities

\[
\begin{aligned}
P_\theta(S \text{ accepts } H_1) &\le \alpha \quad \text{if } \theta \in \omega_0, \\
P_\theta(S \text{ accepts } H_0) &\le \beta \quad \text{if } \theta \in \omega_1,
\end{aligned}
\tag{1}
\]

where $P_\theta(E)$ denotes the probability of the event $E$ when the common
frequency function of the $X_i$ is $f(x, \theta)$.
It will also be assumed that

\[ P_\theta(S \text{ accepts } H_0) + P_\theta(S \text{ accepts } H_1) = 1 \quad \text{for all } \theta \in \Omega. \tag{2} \]


Let $n$ be the number of observations required to terminate the test $S$ (by
accepting $H_0$ or $H_1$). It will be shown that if conditions (1) and (2) are satisfied and

\[ \alpha + \beta \le 1, \tag{3} \]

we have

\[
E_\theta(n) \ge \frac{-\log\left[\alpha^c (1-\beta)^{1-c} + (1-\alpha)^c \beta^{1-c}\right]}
{c\,E_\theta\left(\log \dfrac{f(X,\theta)}{f(X,\theta_0)}\right) + (1-c)\,E_\theta\left(\log \dfrac{f(X,\theta)}{f(X,\theta_1)}\right)}
\tag{4}
\]

for every $c$, $\theta_0$ and $\theta_1$ such that

\[ 0 < c < 1, \quad \theta_0 \in \omega_0, \quad \theta_1 \in \omega_1. \tag{5} \]

Here $X$ denotes a random variable with the same distribution as the $X_i$, and
$E_\theta(U)$ is the expected value of $U$ (a function of $X, X_1, X_2, \cdots$) when the
common frequency function is $f(x, \theta)$.

The expected values in the denominator in (4) always exist and are nonnegative,
possibly $+\infty$ (as can be seen by applying the inequality $\log x \ge 1 - x^{-1}$
if $x > 0$). The numerator in (4) is nonnegative and vanishes if and only
if $\alpha + \beta = 1$.

Received 7/14/52.

1 Work done with the support of the Office of Naval Research.

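The best inequality that can be obtained from (4) is evidently

\[
E_\theta(n) \ge \sup_{0 < c < 1} \frac{-\log\left[\alpha^c (1-\beta)^{1-c} + (1-\alpha)^c \beta^{1-c}\right]}{c\,e_0(\theta) + (1-c)\,e_1(\theta)},
\tag{6}
\]

where

\[
e_i(\theta) = \inf_{\theta' \in \omega_i} E_\theta\left(\log \frac{f(X,\theta)}{f(X,\theta')}\right), \quad i = 0, 1.
\tag{7}
\]
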
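If $\theta \in \omega_1$ then $e_1(\theta) = 0$, and the ratio in (6) can be written as

\[
-\frac{1}{e_0(\theta)} \log\left\{\left[(1-\beta)\left(\frac{\alpha}{1-\beta}\right)^{c} + \beta\left(\frac{1-\alpha}{\beta}\right)^{c}\right]^{1/c}\right\}.
\]

The expression following log is an increasing function of $c$. (For a proof see, e.g.,
Uspensky [1], particularly p. 267.) Letting $c \to 0$, we obtain

\[
E_\theta(n) \ge \frac{\beta \log \dfrac{\beta}{1-\alpha} + (1-\beta) \log \dfrac{1-\beta}{\alpha}}{e_0(\theta)} \quad \text{if } \theta \in \omega_1.
\tag{8}
\]

If $\theta \in \omega_0$, inequality (6) reduces in a similar way to

\[
E_\theta(n) \ge \frac{\alpha \log \dfrac{\alpha}{1-\beta} + (1-\alpha) \log \dfrac{1-\alpha}{\beta}}{e_1(\theta)} \quad \text{if } \theta \in \omega_0.
\tag{9}
\]
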
Inequalities (8) and (9) were obtained by Wald ([2], Section 4.7) under the
assumption that the sets $\omega_0$ and $\omega_1$ consist of one point each and the signs of
equality hold in (1).

The sign of equality can hold in (6) only in certain special cases. If $\alpha + \beta = 1$,
conditions (1) can be satisfied without taking any observations, and hence
$E_\theta(n)$ can attain the lower bound 0. In (8) and (9) equality can be attained by a
sequential probability ratio test for certain special distributions $f(x, \theta)$ and
suitable values $\theta$ (cf. Wald [2]). In general the greatest lower bound for $E_\theta(n)$
is likely to be a complicated expression. The bounds derived here, although in
general not the best ones, have the advantage of being simple.
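For a numerical feel, inequalities (8) and (9) can be evaluated directly once $e_0(\theta)$ or $e_1(\theta)$ is known. The following is a minimal sketch, not part of the original paper; the function names are illustrative, and the caller is assumed to supply a positive value of $e_0$ or $e_1$.

```python
import math


def bound_when_h1_true(alpha, beta, e0):
    """Right-hand side of (8): lower bound on E(n) when theta lies in omega_1."""
    numerator = (beta * math.log(beta / (1 - alpha))
                 + (1 - beta) * math.log((1 - beta) / alpha))
    return numerator / e0


def bound_when_h0_true(alpha, beta, e1):
    """Right-hand side of (9): lower bound on E(n) when theta lies in omega_0."""
    numerator = (alpha * math.log(alpha / (1 - beta))
                 + (1 - alpha) * math.log((1 - alpha) / beta))
    return numerator / e1


# Normal observations with unit variance, testing mean -0.5 against mean +0.5:
# at theta = +0.5 the divergence from the other hypothesis is (2 * 0.5)**2 / 2.
print(bound_when_h1_true(0.05, 0.05, e0=0.5))
```

When $\alpha + \beta = 1$ both numerators vanish, in line with the remark that the bound 0 is then attainable.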

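The greatest lower bound of $E_\theta(n)$ cannot exceed the least sample size
$N = N(\alpha, \beta)$ of the best nonsequential test which satisfies (1). The following
example may serve to compare the bound in (6) with $N(\alpha, \beta)$. Let

\[
f(x, \theta) = \frac{1}{\sqrt{2\pi}}\, e^{-(x-\theta)^2/2}, \quad \omega_0 = \{\theta \le -\delta\}, \quad \omega_1 = \{\theta \ge \delta\}.
\]

Then

\[
\begin{aligned}
e_0(\theta) &= 0 \ \text{ if } \theta \le -\delta, & e_0(\theta) &= (\theta + \delta)^2/2 \ \text{ if } \theta > -\delta, \\
e_1(\theta) &= (\theta - \delta)^2/2 \ \text{ if } \theta < \delta, & e_1(\theta) &= 0 \ \text{ if } \theta \ge \delta.
\end{aligned}
\]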
Suppose that $\alpha = \beta$, and let $\theta = 0$. Then the supremum in (6) is attained for
$c = \tfrac12$, and we obtain $E_0(n) \ge -\log[4\alpha(1 - \alpha)]/\delta^2 = M$, say. (It can be shown
that $M$ is the maximum with respect to $\theta$ of the bound in (6).)

The best nonsequential test which satisfies (1) with $\alpha = \beta$ accepts $H_0$ or $H_1$
according as $\sum_{i=1}^{N} X_i$ is negative or positive, where

\[
N = \frac{\lambda^2}{\delta^2}, \qquad \frac{1}{\sqrt{2\pi}} \int_\lambda^\infty e^{-t^2/2}\,dt = \alpha.
\]

(We assume for simplicity that $\lambda^2/\delta^2$ is an integer.) Hence

\[ M/N = -\log[4\alpha(1 - \alpha)]/\lambda^2, \]

a function of $\alpha$ only which varies between $\tfrac12$ (for $\alpha \to 0$) and $2/\pi$ (for $\alpha \to \tfrac12$).
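The comparison in this example is easy to reproduce numerically. The sketch below is an illustration rather than the author's computation: `bound_6` approximates the supremum in (6) on a grid of values of $c$, and `ratio_m_over_n` evaluates $M/N$, taking $\lambda$ as the upper $\alpha$-point of the standard normal distribution. Both names are hypothetical.

```python
import math
from statistics import NormalDist


def bound_6(alpha, beta, e0, e1, steps=999):
    """Grid approximation to the supremum over 0 < c < 1 in inequality (6)."""
    best = 0.0
    for j in range(1, steps + 1):
        c = j / (steps + 1)
        k_val = alpha**c * (1 - beta)**(1 - c) + (1 - alpha)**c * beta**(1 - c)
        denom = c * e0 + (1 - c) * e1
        if denom > 0:
            best = max(best, -math.log(k_val) / denom)
    return best


def ratio_m_over_n(alpha):
    """M/N = -log[4 alpha (1 - alpha)] / lambda^2 for 0 < alpha < 1/2."""
    lam = -NormalDist().inv_cdf(alpha)  # P(Z > lam) = alpha
    # 4a(1 - a) = 1 - (1 - 2a)^2; log1p keeps precision when alpha is near 1/2
    return -math.log1p(-(1 - 2 * alpha) ** 2) / lam ** 2


delta = 1.0
print(bound_6(0.05, 0.05, delta**2 / 2, delta**2 / 2))  # M for alpha = 0.05
print(ratio_m_over_n(0.05), ratio_m_over_n(0.4999))
```

The approach to the limit $\tfrac12$ as $\alpha \to 0$ is slow, so very small $\alpha$ is needed before the ratio comes close to it.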


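2. Proof of inequality (4). The proof of (4) will be based on the inequality

\[
E_\theta(n)\, E_\theta\left(\log \frac{f(X,\theta)}{f(X,\theta')}\right) \ge L(\theta) \log \frac{L(\theta)}{L(\theta')} + [1 - L(\theta)] \log \frac{1 - L(\theta)}{1 - L(\theta')}, \quad \theta, \theta' \in \Omega,
\tag{10}
\]

where

\[ L(\theta) = P_\theta(S \text{ accepts } H_0). \tag{11} \]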


Inequality (10) is due to Wald [2] and is true for every test $S$ which satisfies
(2). In Wald's proof the test $S$ is assumed to be nonrandomized (in particular,
a first observation is always taken, so that $n \ge 1$), but it is easy to extend the
proof to randomized tests.

To prove (4), put in (10) $\theta' = \theta_0$ and multiply both sides with $c$; then put
$\theta' = \theta_1$ and multiply both sides with $1 - c$. Addition of the corresponding sides
of the two resulting inequalities gives

\[
\begin{aligned}
E_\theta(n) &\left[c\,E_\theta\left(\log \frac{f(X,\theta)}{f(X,\theta_0)}\right) + (1-c)\,E_\theta\left(\log \frac{f(X,\theta)}{f(X,\theta_1)}\right)\right] \\
&\ge L(\theta) \log L(\theta) + [1 - L(\theta)] \log [1 - L(\theta)] - rL(\theta) - s[1 - L(\theta)] \\
&= H(L(\theta)),
\end{aligned}
\tag{12}
\]

say, where

\[
r = c \log L(\theta_0) + (1 - c) \log L(\theta_1), \qquad
s = c \log [1 - L(\theta_0)] + (1 - c) \log [1 - L(\theta_1)].
\]


The minimum of the function $H(u)$ is attained at $u = u_0$, where $u_0 = e^r/(e^r + e^s)$,
and we find

\[ H(u) \ge H(u_0) = -\log K(1 - L(\theta_0), L(\theta_1)), \tag{13} \]

where

\[ K(x, y) = x^c (1 - y)^{1-c} + (1 - x)^c y^{1-c}. \tag{14} \]
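The minimization step can be filled in as follows. This is a short derivation sketch using only the definitions of $H$, $r$ and $s$ given with (12); it is added for the reader's convenience and is not part of the original argument.

```latex
\begin{align*}
H(u)   &= u \log u + (1-u)\log(1-u) - ru - s(1-u), \qquad 0 < u < 1, \\
H'(u)  &= \log\frac{u}{1-u} - r + s, \qquad H''(u) = \frac{1}{u(1-u)} > 0, \\
H'(u_0) &= 0 \iff \frac{u_0}{1-u_0} = e^{r-s} \iff u_0 = \frac{e^{r}}{e^{r}+e^{s}}, \\
\log u_0 - r &= \log(1-u_0) - s = -\log\left(e^{r}+e^{s}\right), \\
H(u_0) &= u_0(\log u_0 - r) + (1-u_0)\left(\log(1-u_0) - s\right) = -\log\left(e^{r}+e^{s}\right), \\
e^{r}+e^{s} &= L(\theta_0)^{c} L(\theta_1)^{1-c} + [1-L(\theta_0)]^{c}[1-L(\theta_1)]^{1-c}
             = K\bigl(1-L(\theta_0),\, L(\theta_1)\bigr).
\end{align*}
```

Since $H$ is strictly convex on $(0, 1)$, the stationary point $u_0$ is its unique minimum.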
The function $K(x, y)$ is an increasing function of $x$ and an increasing function
of $y$, provided $x + y < 1$. Conditions (1) and (2) imply that $1 - L(\theta_0) \le \alpha$,
$L(\theta_1) \le \beta$. Hence if $\alpha + \beta < 1$, we have

\[ K(1 - L(\theta_0), L(\theta_1)) \le K(\alpha, \beta). \tag{15} \]

Inequality (4) now follows from relations (12) to (15).
Concerning the conditions for equality, it suffices to observe that in (10) the
sign of equality holds if and only if there exist constants $C_0$ and $C_1$ such that

\[
P_\theta\left\{ \prod_{j=1}^{n} \frac{f(X_j, \theta')}{f(X_j, \theta)} = C_i \,\Big|\, S \text{ accepts } H_i \right\} = 1, \quad i = 0, 1,
\]

where the usual notation for conditional probabilities is used. This can be
verified from Wald's proof. The conditions for equality in (12), (13), (15) are obvious.

SOME INEQUALITIES ON MILL’S RATIO AND RELATED
FUNCTIONS


By M. R. SAMPFORD


University of Oxford 


1. Introduction. Mill’s ratio is defined as

\[ R_x = e^{x^2/2} \int_x^\infty e^{-u^2/2}\,du. \tag{1} \]

Gordon [1] and Birnbaum [2] have given, respectively, upper and lower limits
for $R_x$ as

\[ \tfrac12\{\sqrt{4 + x^2} - x\} < R_x < 1/x, \quad x > 0. \tag{2} \]


Murty [3] has shown how limits to $R_x$ of any required degree of accuracy can
be derived for $x > 0$ by the use of successive convergents of Laplace’s expression
for the normal integral as a continued fraction. No limits have, as yet, been
published for $x < 0$.

If the functions $\nu(x)$ and $\lambda(x)$ are defined by $\nu(x) = 1/R_x$, $\lambda(x) = \nu'(x) =
\nu(\nu - x)$, the inequalities

\[ 0 < \lambda < 1, \tag{3} \]

\[ \lambda' = \nu\{(\nu - x)(2\nu - x) - 1\} > 0 \tag{4} \]

are of importance in the theory of the analysis of response-times and of truncated
normal data generally. The result (4) was conjectured as true for positive
$x$ by Birnbaum [4] and was proved for sufficiently large positive $x$ by Murty [3].
In this paper it is proved for all finite $x$, and is used to provide an upper limit
for $R_x$ valid over the range $x > -1$. The result (3) is also proved for all finite
$x$. The upper limit is equivalent to the lower limit in (2), which is thus valid
for $x < 0$ as well as for positive $x$.

Received 5/15/52, revised 9/16/52.


2. Proof of the inequality on $\lambda$. The function

\[ e^{-u^2/2} \Big/ \int_x^\infty e^{-u^2/2}\,du \]

is a p.d.f. over the range $x \le u < \infty$, and its variance is easily shown to be
$1 - \nu(\nu - x)$. Since this must be positive for finite $x$, the upper limit in (3)
follows. Also $\nu > 0$ by definition. Hence $(\nu - x) > 0$ for $x \le 0$ and, by (2),
for $x > 0$, and the lower limit follows.


3. Proof of the inequality on $\lambda'$. The result (4) is equivalent to

\[ \varphi = (\nu - x)(2\nu - x) > 1. \tag{5} \]

An expansion by parts for $x > 0$ gives

\[ \int_x^\infty e^{-u^2/2}\,du = \frac{e^{-x^2/2}}{x} \left\{1 - \frac{1}{x^2} + R\right\}, \]

where $R = O(1/x^4)$; whence

\[ \varphi = \frac{1 + x^2\,O(1/x^4)}{\left\{1 - \dfrac{1}{x^2} + R\right\}^2} \to 1 \quad \text{as } x \to \infty. \tag{6} \]

Also, as $x \to -\infty$, $\nu \to 0$ and

\[ \varphi \to \infty. \tag{7} \]

Now suppose there exists a finite $x_1$ such that

\[ \varphi(x_1) = 1. \tag{8} \]

Then, since $\varphi$ is continuous and differentiable, (6), (7), and (8) imply the
existence of a finite point $x_2$ for which

\[ \varphi(x_2) \le 1, \qquad \varphi'(x_2) = 0. \tag{9} \]

But $\varphi' = (\lambda - 1)(2\nu - x) + (\nu - x)(2\lambda - 1) = \nu(\varphi - 1) + 2(\nu - x)(\lambda - 1)$,
whence, for finite $x$, $\varphi' < \nu(\varphi - 1)$, so that conditions (9) cannot be satisfied
simultaneously, and the result (4) follows.
The quadratic $\varphi = 1$ has solutions

\[ \nu = \frac{3x \pm \sqrt{x^2 + 8}}{4}. \tag{10} \]

As $\nu$ is known to be greater than $x$, only the positive sign in (10) need be
considered. The result so obtained is everywhere greater than $x$, and positive for
all $x > -1$, giving the result

\[ R_x < 4/\{3x + \sqrt{8 + x^2}\}, \quad x > -1. \]
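The two limits can be compared with a direct evaluation of Mill's ratio. The sketch below is an illustration, not part of the paper. It assumes the standard identity $\int_x^\infty e^{-u^2/2}\,du = \sqrt{\pi/2}\,\operatorname{erfc}(x/\sqrt{2})$, and the function names are illustrative.

```python
import math


def mills_ratio(x):
    """R_x = exp(x^2/2) * integral from x to infinity of exp(-u^2/2) du."""
    return math.exp(x * x / 2) * math.sqrt(math.pi / 2) * math.erfc(x / math.sqrt(2))


def lower_limit(x):
    """Lower limit in (2), shown in the paper to hold for every finite x."""
    return (math.sqrt(4 + x * x) - x) / 2


def upper_limit(x):
    """Upper limit 4 / (3x + sqrt(8 + x^2)), stated for x > -1."""
    return 4 / (3 * x + math.sqrt(8 + x * x))


for x in (-0.5, 0.0, 1.0, 3.0):
    print(x, lower_limit(x), mills_ratio(x), upper_limit(x))
```

For large positive $x$ the factor `math.exp(x * x / 2)` eventually overflows, so this direct form is only suitable for moderate arguments.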


4. A corollary on the weight function in probit analysis. The function

\[ \psi(x) = e^{-x^2} \Big/ \left\{ \int_{-\infty}^{x} e^{-u^2/2}\,du \int_{x}^{\infty} e^{-u^2/2}\,du \right\} \]

is well known as the weight function in probit analysis. From tables it is obvious
that $\psi$ is a decreasing function of $x^2$. Hammersley [5] has given a rather
complicated proof of this result, and has remarked on the apparent lack of a simple
proof. In fact

\[
\begin{aligned}
\psi'(x) &= \psi(x)\{\nu(x) - \nu(-x) - 2x\} \\
&= 2x\,\psi(x)\{\lambda(x') - 1\}, \quad \text{where } -|x| < x' < |x|,
\end{aligned}
\]

by the Mean Value Theorem, and, since $\psi$ is positive by definition, the result
follows immediately from (3) above.
ON A DOUBLE INEQUALITY OF THE NORMAL DISTRIBUTION¹


By ROBERT F. TATE
University of California 


In this note we shall extend certain results of R. D. Gordon and Z. W. Birn- 
baum concerning bounds for the normal distribution function. 


Received 1/7/52, revised 8/23/52. 


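Gordon [1] obtained the inequalities

\[
\frac{x}{x^2 + 1}\,\frac{1}{\sqrt{2\pi}}\,e^{-x^2/2} \le \frac{1}{\sqrt{2\pi}} \int_x^\infty e^{-t^2/2}\,dt \le \frac{1}{x}\,\frac{1}{\sqrt{2\pi}}\,e^{-x^2/2} \quad \text{for } x > 0.
\]

Birnbaum [2] improved Gordon’s lower bound, obtaining the inequality

\[
\frac{\sqrt{4 + x^2} - x}{2}\,\frac{1}{\sqrt{2\pi}}\,e^{-x^2/2} \le \frac{1}{\sqrt{2\pi}} \int_x^\infty e^{-t^2/2}\,dt \quad \text{for } x \ge 0.
\]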


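It was pointed out by Feller [3] that

\[
\frac{1}{\sqrt{2\pi}} \int_x^\infty e^{-t^2/2}\,dt \sim \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2} \left\{ \frac{1}{x} - \frac{1}{x^3} + \frac{1 \cdot 3}{x^5} - \cdots + (-1)^k \frac{1 \cdot 3 \cdots (2k - 1)}{x^{2k+1}} \right\},
\]

where for $x > 0$ the right side is an upper bound when $k$ is even and a lower
bound when $k$ is odd. It is evident that Feller’s expression does not constitute an
improvement of the bounds of Gordon and Birnbaum when $0 < x < 1$. The
following theorem gives new bounds for

\[ \frac{1}{\sqrt{2\pi}} \int_x^\infty e^{-t^2/2}\,dt. \]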

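THEOREM:

\[
\frac12 - \frac12\left(1 - e^{-x^2}\right)^{1/2} \le \frac{1}{\sqrt{2\pi}} \int_x^\infty e^{-t^2/2}\,dt \le \frac12 + \frac{e^{-x^2/2}}{x\sqrt{2\pi}} - \left(\frac14 + \frac{e^{-x^2}}{2\pi x^2}\right)^{1/2} \quad \text{for } x \ge 0,
\]

\[
\frac12 + \frac{e^{-x^2/2}}{x\sqrt{2\pi}} + \left(\frac14 + \frac{e^{-x^2}}{2\pi x^2}\right)^{1/2} \le \frac{1}{\sqrt{2\pi}} \int_x^\infty e^{-t^2/2}\,dt \le \frac12 + \frac12\left(1 - e^{-x^2}\right)^{1/2} \quad \text{for } x < 0.
\]

For the case $x \ge 0$ the lower bound exceeds that of Birnbaum for some $x$ and
is exceeded by it for other values of $x$. The upper bound is an improvement on
the result of Birnbaum and Gordon for all $x$. The inequalities for $x \le 0$ are of
course obtainable from the relation

\[
\int_x^\infty \frac{1}{\sqrt{2\pi}}\,e^{-t^2/2}\,dt = 1 - \int_{-x}^\infty \frac{1}{\sqrt{2\pi}}\,e^{-t^2/2}\,dt.
\]

The proof of the theorem will consist in proving two lemmas and then combining
the results. In what follows we shall use the notation

\[
f(x) = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2} \quad \text{and} \quad F(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}}\,e^{-t^2/2}\,dt.
\]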
Lemma 1. $(2F - 1)f \ge F(1 - F)x$ for $0 \le x < \infty$, with equality at 0 and $\infty$.
Proof. Let $g = (2F - 1)f - F(1 - F)x$. Then,

\[ g' = 2f^2 - F(1 - F), \qquad g'' = -4xf^2 + (2F - 1)f. \tag{1} \]

It may easily be shown that $g$ is continuous with derivatives of all orders, $g(0)
= g(\infty) = 0$, and $g'(0) > 0$. From this we see that unless $g$ is nonnegative for
all positive $x$, there exists a minimum $x_0$ for which $g(x_0) < 0$. Now, from (1) and
the definition of $g$, we have at $x_0$ that $g'' < F(1 - F)x_0 - 4x_0 f^2 = -2x_0 f^2 < 0$, which is
impossible. Hence, $g$ is nonnegative for all positive $x$, which completes the proof.
Lemma 2. $F(1 - F) \ge \pi f^2/2$ for $0 \le x < \infty$, with equality at 0 and $\infty$.
Proof. Let $h = F(1 - F) - \pi f^2/2$. Then,

\[ h' = f(1 - 2F) + \pi x f^2, \qquad h'' = f^2(\pi - 2 - 2\pi x^2) - xf(1 - 2F). \tag{2} \]

It may be shown that $h$ is continuous with derivatives of all orders, $h(0)
= h(\infty) = 0$, $h'(0) = 0$, and $h''(0) > 0$. Let $y_0$ be an extremum of $h$. Then, from
(2), $h'' = f^2(\pi - 2 - \pi y_0^2)$ at the point $y_0$. Hence, $y_0 \le ((\pi - 2)/\pi)^{1/2}$ if $y_0$ is a
minimum and $y_0 \ge ((\pi - 2)/\pi)^{1/2}$ if $y_0$ is a maximum, so that if a minimum and
a maximum both exist, the minimum must precede the maximum. In view of this
circumstance it is evident from the above mentioned properties of $h$, $h'$ and $h''$
that a minimum cannot exist, and therefore that $h$ is nonnegative for all
positive $x$.
The results of Lemmas 1 and 2 can be rewritten respectively as

\[ \left(F + \frac{f}{x} - \frac12\right)^2 \ge \frac14 + \frac{f^2}{x^2}, \tag{3} \]

\[ \left(F - \frac12\right)^2 \le \frac14 - \frac{\pi f^2}{2}. \tag{4} \]

For $x \ge 0$ the upper bound of the theorem is obtainable from (3) and the
lower bound from (4).
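Both lemmas, and the bounds on the upper tail that follow from them, can be checked numerically. The sketch below is an illustration and not part of the note. The bounds coded in `tail_bounds` are those obtained by solving the Lemma 2 and Lemma 1 inequalities for $1 - F(x)$ with $x > 0$, and all function names are hypothetical.

```python
import math


def f(x):
    """Standard normal density."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


def upper_tail(x):
    """1 - F(x), computed with erfc to stay accurate for large x."""
    return 0.5 * math.erfc(x / math.sqrt(2))


def lemma_margins(x):
    """Return (g, h): the nonnegative quantities of Lemma 1 and Lemma 2."""
    t = upper_tail(x)
    big_f = 1 - t
    g = (2 * big_f - 1) * f(x) - big_f * t * x
    h = big_f * t - math.pi * f(x) ** 2 / 2
    return g, h


def tail_bounds(x):
    """Lower and upper bounds on the upper tail for x > 0."""
    lower = 0.5 - 0.5 * math.sqrt(1 - math.exp(-x * x))   # from Lemma 2
    a = f(x) / x
    upper = 0.5 + a - math.sqrt(0.25 + a * a)              # from Lemma 1
    return lower, upper


for x in (0.5, 1.0, 2.0):
    print(x, tail_bounds(x)[0], upper_tail(x), tail_bounds(x)[1])
```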
CORRECTION TO “SOME NONPARAMETRIC TESTS OF
WHETHER THE LARGEST OBSERVATIONS OF A SET
ARE TOO LARGE OR TOO SMALL”*


By JOHN E. WALSH
U.S. Naval Ordnance Test Station, China Lake 


This note calls attention to the fact that Theorem 4 of this paper (Annals 
of Math. Stat., Vol. 21 (1950), pp. 583-592) is only partially correct. The results 
limg_.P?i1(®) = 0 and lims._.P;(@) = 1 as well as the monotonicity properties
are not necessarily satisfied on the basis of the conditions stated in the theorem.
The error arose from an incorrect and unstated assumption which was used in
the derivations. This incorrect assumption was that

$x(n) - \theta, \cdots, x(n + 1 - r) - \theta, x(n - r) - \phi, \cdots, x(1) - \phi$

represent a set of statistically independent observations.

* Received 1/29/52, revised form 9/19/52.

Test 3 of this paper can be interpreted as a method of deciding whether the 
largest observations are too small or as a test of whether the smallest observations 
are too small. An unpublished analysis shows that only the latter interpretation 
is of practical interest. Similarly, the appropriate interpretation for Test 1 is as 
a method of deciding whether the largest observations are too large. With these 
interpretations, both tests are of the ‘‘outlying observation” type. The unpub- 
lished analysis shows that Tests 1 and 3 are consistent under conditions much 
more general than those considered in Theorem 4 if these interpretations are 
adopted. Copies of this analysis can be obtained by writing the author at the
U. S. Naval Ordnance Test Station, China Lake, California. One place where
Tests 1 and 3 may have practical value is where differences of paired observations
are being considered. Then the symmetry assumption often can be
accepted.


——————


CORRECTION TO “ON THE STRUCTURE OF BALANCED
INCOMPLETE BLOCK DESIGNS”*


By W. S. CONNOR
National Bureau of Standards


In the paper under the above title (Annals of Math. Stat., Vol. 23 (1952),
pp. 57-71) the number of blocks of type 1 given in Lemma 4.2 should be

(k — y)\(k —y +1) + ky — 3k(k + 3) + 1.

I am indebted to Dr. W. H. Clatworthy for bringing this error to my attention.


—————— 


CORRECTION TO “ON A TEST FOR HOMOGENEITY AND 
EXTREME VALUES” 


By D. A. DARLING
University of Michigan and Columbia University 


In reference to the above paper (Annals of Math. Stat., Vol. 23 (1952), pp. 450-
456) Professor Herbert Solomon has kindly pointed out an ambiguity in the last
paragraph of Section 2. It is stated there that the table of reference [9] “appears
to be subject to certain indeterminate errors.” This remark was intended simply
to call attention to the discussion of the accuracy of the table made by the
authors of [9] and should not discourage the use of the table since the present writer
was merely referring to the evaluation of accuracy made by the authors themselves.

* Received 9/15/52.

Meanwhile Dr. Churchill Eisenhart has pointed out that an exact table for 
the case of two degrees of freedom (r = 0) has been added to reference [1] as 
printed in Contributions to Mathematical Statistics by R. A. Fisher (John Wiley 
and Sons, 1950). 


——————


ABSTRACTS OF PAPERS 


(Abstracts of papers presented at the Chicago meeting of the Institute, 
December 27-30, 1952) 


1. A Two-Sample Multiple Decision Procedure for Ranking Means of Normal 
Populations with Unknown Variances. (Preliminary Report.) ROBERT E.
BECHHOFER, CHARLES W. DUNNETT AND MILTON SOBEL, Cornell University.


The multiple decision problem of ranking k normal populations according to their popu- 
lation means when both the means and the variances are unknown is considered. Useful 
confidence statements, which are independent of the unknown variances, can be made 
concerning the rankings if a generalized Stein two-sample procedure is used. The rankings 
are of the general type formulated by Bechhofer (Ann. Math. Stat., Vol. 23 (1952), p. 139), 
which includes selecting the population with the largest population mean and ranking 
completely all k populations. Two cases are considered: A) population variances unknown 
and equal, and B) population variances unknown and not necessarily equal. To determine 
the size (which may be zero) of the second sample, tables for a (k — 1)-variate analogue of 
Student’s t-distribution are required for case A, and tables for a (k — 1)-variate analogue 
of the distribution of the difference between two independent Student t-statistics are re- 
quired for case B. For k = 2, tables due to Fisher and Sukhatme are available for cases 
A and B, respectively. For case A, k = 3, tables are being computed. For case A, k > 3 and
for case B, k ≥ 3, computation of the necessary tables is contemplated. (This research was
sponsored by Air Research and Development Command.) 


2. A Sequential Multiple Decision Procedure for Ranking Means of Normal 
Populations with Known Variances. (Preliminary Report.) ROBERT E.
BECHHOFER AND MILTON SOBEL, Cornell University.


Let $X_{ij}$ be normally and independently distributed $N(X_{ij} \mid \mu_i, \sigma^2)$ from population
$\pi_i$ ($i = 1, \cdots, k$; $j = 1, 2, \cdots$). The $\mu_i$ are unknown; $\sigma^2$ is known. Denote the ranked $\mu_i$
by $\mu_{[1]} \le \cdots \le \mu_{[k]}$. It is desired to choose the population with mean $\mu_{[k]}$. Denote the
sample sum based on $m$ observations from $\pi_i$ by $Q_{im} = \sum_{j=1}^{m} X_{ij}$; denote the ranked $Q_{im}$ by
$R_{1m} < \cdots < R_{km}$. Define $y_m = R_{km} - R_{k-1,m}$ ($m = 1, 2, \cdots$) and $\delta_{k,i} = \mu_{[k]}
- \mu_{[i]}$ ($i = 1, \cdots, k - 1$). Let $\delta^* > 0$ be the smallest value of $\delta_{k,k-1}$ that it is desired to
detect, and let $1 - \beta$ ($1/k < 1 - \beta < 1$) be the desired probability of a correct choice (p.c.c.)
when $\delta_{k,k-1} \ge \delta^*$. Let $C = (k - 1)(1 - \beta)/\beta$. The sequential procedure $S_D$ given below, where
$D = (\sigma^2/\delta^*) \log_e C$, has the property that: p.c.c. is $\ge 1 - \beta$ when $\delta_{k,k-1} \ge \delta^*$ with equality
holding when $\delta_{k,i} = \delta^*$ ($i = 1, 2, \cdots, k - 1$). The procedure $S_D$ is: “Observe the $k$-tuple
$(x_{11}, \cdots, x_{k1})$ and compute $y_1$. If $y_1 \ge D$, cease taking observations and choose the
population associated with $R_{k1}$ as the one having mean $\mu_{[k]}$. If $y_1 < D$, observe the $k$-tuple
$(x_{12}, \cdots, x_{k2})$ and compute $y_2$. Proceed as above for $m = 2, 3, \cdots$ until observation-taking
ceases and a population is chosen.” When $S_D$ is used, the g.l.b. of the p.c.c. (neglecting, as
in Wald-sequential, the excess over the boundary) is given by $L(\delta) = C^{\delta/\delta^*} \div \{C^{\delta/\delta^*} +
(k - 1)\}$ for all $\delta_{k,i}$ for which $\delta = \operatorname{Min} \delta_{k,i}$ ($i = 1, \cdots, k - 1$). Truncation of $S_D$ also has
been studied. Generalizations of $S_D$ solve a large class of ranking problems. (Research
sponsored by Air Research and Development Command.)
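The stopping rule described in this abstract is simple enough to simulate. The following is a minimal Monte Carlo sketch, not the authors' computation: it draws one observation per population at each stage and stops as soon as the gap between the two largest cumulative sums reaches $D = (\sigma^2/\delta^*) \log_e C$. The function names, seed and trial count are illustrative assumptions.

```python
import math
import random


def run_procedure(mus, sigma, delta_star, beta, rng):
    """One run of the rule; returns (index of chosen population, stages used).

    Assumes 1/k < 1 - beta < 1, so that C > 1 and the boundary D is positive.
    """
    k = len(mus)
    c = (k - 1) * (1 - beta) / beta
    d = sigma ** 2 / delta_star * math.log(c)
    sums = [0.0] * k
    stage = 0
    while True:
        stage += 1
        for i, mu in enumerate(mus):
            sums[i] += rng.gauss(mu, sigma)
        order = sorted(range(k), key=lambda i: sums[i])
        # gap between the largest and second-largest cumulative sums
        if sums[order[-1]] - sums[order[-2]] >= d:
            return order[-1], stage


def estimate_pcc(mus, sigma, delta_star, beta, trials=2000, seed=0):
    """Monte Carlo estimate of the probability of a correct choice."""
    rng = random.Random(seed)
    best = max(range(len(mus)), key=lambda i: mus[i])
    hits = sum(run_procedure(mus, sigma, delta_star, beta, rng)[0] == best
               for _ in range(trials))
    return hits / trials


# Three populations, the best one ahead of the others by exactly delta* = 1.
print(estimate_pcc([0.0, 0.0, 1.0], sigma=1.0, delta_star=1.0, beta=0.1))
```

Because the cumulative sums overshoot the boundary, the simulated probability of a correct choice will typically lie somewhat above $1 - \beta$.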


3. On Quadratic Estimates of Variance Components. FRANKLIN GRAYBILL, 
Oklahoma A. & M. College and Iowa State College. 


A common method of estimating variance components is ‘‘the analysis of variance’’ 
method where the estimate is obtained by equating expected mean squares to observed 
mean squares in an analysis of variance table and solving the resulting equations for the 
desired variance components. For any infinite population let the general hierarchal
classification model be given by $y_{ij\cdots p} = \mu + a_i + b_{ij} + \cdots + e_{ij\cdots p}$ where $\mu$ is a constant and
$a_i, b_{ij}, \cdots, e_{ij\cdots p}$ are independent random variables with means zero, variances
$\sigma_1^2, \sigma_2^2, \cdots, \sigma_s^2$ respectively, and with finite fourth moments. Let $\hat\sigma_i^2$ be “the analysis of variance
estimate” for $\sigma_i^2$. It is shown that the quadratic estimate of $\sum_{i=1}^{s} g_i \sigma_i^2$ ($g_i$ known) which
is unbiased, independent of $\mu$, and has minimum variance is given by $\sum_{i=1}^{s} g_i \hat\sigma_i^2$. That is
to say, the best unbiased quadratic estimate of a linear combination of variance compo- 
nents is given by the same linear combination of ‘“‘the analysis of variance estimates”’ of 
the individual variance components. 


4. Topics in Analysis of Variance: A. Optimum Properties of Tests for Model 
II, B. Generalizations of Model II. (Preliminary Report.) LEON HERBACH,
Brooklyn College and Columbia University. 


A canonical form analogous to the one used for the general linear hypothesis is developed 
for Model II analysis of variance for one-way classifications and balanced (in the sense of 
Crump, Biometrics, Vol. 7 (1951), pp. 1-16) multiple classifications. From this it is shown
that 1) all exact F-tests used in testing hypotheses based on balanced multiple classifications
determine uniformly most powerful (u. m. p.) similar regions although they are not
always likelihood ratio tests; 2) in the case of one-way classifications the F-test is a u. m. p.
invariant test; if the classification is unbalanced there exists no u. m. p. similar region; if
it is balanced the F-test is a likelihood ratio test. Two new models are considered which
do away with the objections usually made to Model II, namely that 1) in multiple
fications the interactions must be distributed independently of the main effects and 2) one 
cannot account for a larger frequency of negative estimates of variance components than 
can be explained by sampling errors. These models have as special cases the usual Model 
II. (Work partially sponsored by the Office of Naval Research.) 


5. Further Examples for which the Likelihood Ratio Test is Poor. L. M. Court, 
Brooklyn College. 


In an earlier paper, the writer gave a very general method for constructing situations
in which the likelihood ratio was a poor test. That is, very general families of density
distributions were constructed and hypotheses, both simple and composite, set up within these
families such that the probability of rejecting any element of the hypothesis was greater 
when that element was true than when any element of the alternative was true. This was 
so with reference to a suitable critical region that was part of the construction. (Previously 
both Rubin and Stein had given simple, restricted examples in which the likelihood ratio 
test was poor.) In these constructions, the dimensionality of the parameter space was 
always the same as the dimensionality of the variate space. In the present paper, the writer 
shows how by building upon the earlier constructions and using product spaces, it is pos- 
sible to give examples for which the likelihood ratio test is poor and these two dimensionali- 
ties are different. 


6. An Empirical Sampling Method. E. L. Cox, Dugway Proving Ground. 


Certain sequences of dependent trials where there are at least two possible outcomes at 
each trial give rise to such probability distributions as 


P(X = )= [((N mn T nt N)[A” t T"/(n vile t)!], or P(X a 
= [((N — T)at/N7)(T + n — t)[A*~t T*-1/(n — t)!], 


etc., where N and n are population size and sample size respectively and T and t are the
number of individuals possessing a certain attribute in the population and sample
respectively. The individual probabilities from such distributions may be difficult to evaluate
explicitly. A method using IBM punched cards has been devised to approximate the
probability distributions. A set of random numbers X are punched from a deck containing a
random number table. These random numbers are transformed by machine methods to the
desired random number set Y under the relation Y = X (mod N). The random variables
Y are used in the machine operations from which the probability approximations can be
evaluated. Some worked examples are shown.


7. Distribution of Semi-Definite and of Indefinite Quadratic Forms. JOHN
GURLAND, Iowa State College.


The problem considered is that of finding the distribution function of $\sum_{i=1}^{n} \lambda_i X_i^2$ where
$X_1, X_2, \cdots, X_n$ are independent normal random variables each with mean zero and unit
variance. The method used to solve this problem extends a result of Bhattacharya (Sankhya,
1945). A convenient expansion of the characteristic function and then an application of
the inversion formula yields a series in Laguerre polynomials which converges to the exact
distribution. For definite or semi-definite forms, this expansion is a special case of previous
results (submitted to the Annals of Mathematical Statistics) obtained by the author. For
indefinite forms, the series obtained applies to the case where the number of positive $\lambda$’s
or the number of negative $\lambda$’s is even. There are no other restrictions on the $\lambda$’s.


8. Asymptotic Properties of Ideal Linear Estimators. CARL ALLEN BENNETT,
General Electric Company. 


The problem studied is the determination of the asymptotic properties of ideal linear 
estimators of the location and scale parameters of a continuous univariate distribution 
function. An ideal linear estimator is defined as a linear combination of the order statistics 
which provides an unbiased estimate of a particular parameter with minimum variance. 
It is shown that if the probability density function is continuous and nonvanishing over 
the range of possible values, then the ideal linear estimators of the location and scale
parameters based on all the order statistics are asymptotically efficient. The limiting form of
the coefficients of these estimators is obtained. The final results are obtained in a sufficiently 
general form so that they can be applied to truncated samples and the effect of such trunca- 
tion studied. Certain situations in which the probability density function is discontinuous 
and the estimators ‘‘super-efficient”’ are dealt with as limiting cases of a truncated sample. 
(This paper summarizes the principal developments in a thesis of the same title submitted 
to the University of Michigan in February 1952 in partial fulfillment of the requirements 
for the degree of Doctor of Philosophy.) 


9. Bias in Estimation by Interval. M. C. K. TWEEDIE, Introduced by B.
HARSHBARGER, Virginia Polytechnic Institute.


Suppose that a sample of n values is drawn from a constant univariate population Π
(not necessarily continuous). It is well known (cf. K. R. Nair, 1940) that the closed interval
extending from the rth smallest to the rth largest observation can, with an appropriate
choice of r, be used as a confidence interval for the median of Π. Under a condition which is
nearly always satisfied in practice, no quantile of Π is more likely to be covered by this
interval than is the median. Apart from the slight restriction referred to, this seems to be
a very satisfactory property. In general, a confidence interval might be more likely to
cover some incorrect value of the parameter than to cover the true value, and it would then
represent a case of “bias” in a sense which appears not to have been discussed previously.
Let A(θ | θ’) denote the probability that a confidence interval will include θ when θ’ is the
true value. It is proposed that the dependence of A(θ | θ’) on θ, at various constant θ’,
should be considered in setting up definitions of desirable properties of confidence intervals.
This idea is illustrated by a number of examples. Neyman’s (1937) discussion of “unbiassed”
systems of confidence intervals is based on the dependence of A(θ | θ’) on θ’ at constant θ,
which is not necessarily equivalent to the new proposal.
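The confidence coefficient of the order-statistic interval mentioned at the start of this abstract follows from a binomial count. The sketch below is an illustration under the added assumption of a continuous population; it is not taken from the abstract, and the function name is hypothetical. The interval from the rth smallest to the rth largest of n observations misses the median only if fewer than r observations fall on one side of it.

```python
from math import comb


def median_interval_coverage(n, r):
    """P(x_(r) <= median <= x_(n-r+1)) for a continuous population.

    The number of observations below the median is Binomial(n, 1/2); the
    interval covers the median unless that number is < r or > n - r.
    Requires 2r <= n + 1 so that the two miss events are disjoint.
    """
    miss_one_side = sum(comb(n, i) for i in range(r)) / 2 ** n
    return 1 - 2 * miss_one_side


print(median_interval_coverage(10, 2))   # 1 - 22/1024
print(median_interval_coverage(20, 6))
```

For a population that is not continuous, ties at the median can only raise the coverage of the closed interval, so this value serves as a lower bound there.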


10. Completeness of the Order Statistics in the Nonparametric Case. D. A. S.
FRASER, University of Toronto.


In sampling from an unknown distribution on the real line, the order statistics are known 
to be sufficient. Here they are shown to form a complete statistic for any class of absolutely 
continuous distributions which contains at least all uniform distributions over finite num- 
bers of intervals. In U M V unbiased estimation in the nonparametric case, the Lehmann- 
Scheffé theorem remains valid so long as the order statistics remain complete for these 
distributions for which the parameter exists. Any estimable parameter which exists for 
bounded distributions has then a U M V unbiased estimate. 


11. Parameter-free and Nonparametric Tolerance Limits: The Exponential 
Case. LEO A. GOODMAN, University of Chicago.


Parameter-free tolerance limits are developed based on the first $r$ ordered observations
$x_1 \le x_2 \le \cdots \le x_r$ from a sample of $n$ exponentially distributed variates. Comparisons
between these tolerance limits and nonparametric limits are made. For example, although
$E\{x_1\} = E\{\sum_{i=1}^{n} x_i/n^2\}$, we find that a sample of 459 observations is required to obtain
99 per cent tolerance limits at probability level 99 per cent if the limits are the
nonparametric $(x_1, \infty)$; and only 122 observations are required if the limits are the parameter-free
$(\sum_{i=1}^{n} x_i/n^2, \infty)$. The exact behavior of the coverage is investigated; for example, if the
parameter-free limits are of the form $([\sum_{i=1}^{r} x_i + (n - r)x_r]c/r, \infty)$, where $c$ is a constant,
the distribution of the coverage is obtained. The asymptotic behavior of the coverage is
also studied. It is shown that the parameter-free limits are asymptotically better than
the nonparametric limits $(x_1, \infty)$ whenever $1 - en/r > \beta$ for $100\beta$ per cent tolerance limits.
Tables of numerical comparisons are made. The method of obtaining parameter-free
tolerance limits may be generalized to the case where the distribution of the observations has
the invariance property under change of scale.
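The figure of 459 observations for the nonparametric limits can be reproduced from the distribution of the coverage. The sketch below is an illustration, not taken from the abstract, and the function name is hypothetical. For a continuous population the coverage of $(x_1, \infty)$ is distributed as the largest of $n$ uniform variates, so it is at least $g$ with probability $1 - g^n$.

```python
def nonparametric_sample_size(coverage, confidence):
    """Smallest n such that the interval (smallest observation, infinity)
    covers at least `coverage` of the population with probability at least
    `confidence`, using P(coverage >= g) = 1 - g**n."""
    n = 1
    while 1 - coverage ** n < confidence:
        n += 1
    return n


print(nonparametric_sample_size(0.99, 0.99))
```

The same function gives the sample size for other coverage and confidence levels, for instance 95 per cent coverage at level 95 per cent.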


12. On Estimates Whose Distributions Have a Weak Invariance Property. 
LEO A. GOODMAN, University of Chicago.


Let X be a real-valued random variable whose distribution depends on an unknown
real parameter θ. Suppose θE{X | θ}/E{X² | θ} = A, where A is known. Then among all
statistics of the form aX, where a is a constant, the mean square error E{(aX − θ)² | θ}
is minimized uniformly when a = A. Also, among all statistics of the form aX, the only
one which is unbiased with respect to the loss function W(θ, f(X)) = (f(X) − θ)²/θ² is AX,
the statistic which minimizes the mean square error. As a corollary of this result we have
the following. Suppose X is an unbiased estimate (in the usual sense) of θ whose coefficient
of variation is a known constant V. Then the estimate X/(V² + 1) has a smaller mean
square error than the estimate X. The relative improvement in the mean square error
by using X/(V² + 1), rather than X, is V²/(V² + 1). The estimate X/(V² + 1) is unbiased
with respect to the loss function W(θ, f(X)). Another corollary of this result is related to the
problem of the estimation of the scale parameter of a population whose form may not be
known but the ratio of the first and second moments is given. Applications of this result are
presented which are related to the problem of estimating the standard deviation and vari-
ance of normal variates, the range of uniform variates, and other problems.
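
The corollary on the coefficient of variation can be checked with a short closed-form calculation. The sketch below assumes only what the abstract states: X is unbiased for θ with known coefficient of variation V, so that the second moment of X is θ²(V² + 1). The function names are illustrative and do not come from the paper.

```python
def mse_of_scaled_estimate(a, theta, cv):
    """Mean square error E{(aX - theta)^2} when E{X} = theta and the
    coefficient of variation of X is cv (so Var X = cv^2 * theta^2)."""
    second_moment = theta ** 2 * (cv ** 2 + 1.0)  # E{X^2}
    return a * a * second_moment - 2.0 * a * theta ** 2 + theta ** 2


def relative_improvement(theta, cv):
    """Fractional reduction in mean square error from using
    X/(cv^2 + 1) instead of X; requires cv > 0."""
    mse_plain = mse_of_scaled_estimate(1.0, theta, cv)
    mse_shrunk = mse_of_scaled_estimate(1.0 / (cv ** 2 + 1.0), theta, cv)
    return (mse_plain - mse_shrunk) / mse_plain


if __name__ == "__main__":
    # With V = 0.5 the abstract's formula V^2/(V^2 + 1) gives 0.25/1.25.
    print(relative_improvement(theta=3.0, cv=0.5))
```

The value does not depend on θ, which agrees with the abstract's statement that the improvement is V²/(V² + 1).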


13. The Admissibility of Certain Statistical Tests. (Preliminary Report.) Erich
L. Lehmann and Charles M. Stein, University of California, Berkeley
and University of Chicago.


Let ω_0, ω_1 be one-parameter families of probability measures, each element of ω_i being
obtained from a single element of ω_i by a translation. Under weak restrictions, the unique
most powerful test of given size among all those invariant under translation is admissible
in the class of all tests. This is true in particular for the simplest applications of (central
or noncentral) Student’s t test: X_1, ..., X_n are independently normally distributed with
unknown mean ξ and unknown variance σ². An admissible test for H_0: ξ/σ = δ_0 against
H_1: ξ/σ = δ_1 > δ_0 (where δ_0, δ_1 are given numbers) is given by: Reject H_0 if ΣX_i ≥
k√(ΣX_i²).


14. Some Experimental Designs for Comparative Life-Testing. Allan Birnbaum,
Columbia University.


Consider two populations of objects whose “durations of life” t have respective densities
f(t, θ_i) = (1/θ_i) exp[−t/θ_i], i = 1, 2. Let M objects from each population be observed
continuously, with each failing item being immediately replaced by another from the same
population. Let Y_j = 1 or 0 as the jth failing item is from the first or second population.
Let p = Prob{Y_j = 1} and γ = θ_1/θ_2. Then p = 1/(1 + γ). The sequence of observed values
y_1, y_2, ..., may thus be used in sequential or nonsequential procedures to give tests or
interval estimates of θ_1/θ_2. M may be altered during experimentation without invalidating
these procedures. Observed “waiting times” t_j between observations y_{j−1} and y_j may be
used to give independent confidence statements concerning min(θ_1, θ_2) or max(θ_1, θ_2).
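
The relation p = 1/(1 + γ) can be inverted to turn the observed sequence of indicators into a point estimate of γ = θ_1/θ_2. The following is a minimal sketch of that plug-in idea only; the abstract's own sequential and interval procedures are not reproduced, and the function names are hypothetical.

```python
def prob_first_population(gamma):
    """p = Prob{Y_j = 1} = 1/(1 + gamma), where gamma = theta_1/theta_2."""
    return 1.0 / (1.0 + gamma)


def estimate_gamma(ys):
    """Plug-in estimate of gamma = theta_1/theta_2 from indicators y_j.

    Since p = 1/(1 + gamma), gamma = (1 - p)/p; replacing p by the observed
    proportion of ones gives (number of zeros)/(number of ones).
    """
    ones = sum(ys)
    zeros = len(ys) - ones
    if ones == 0:
        raise ValueError("no failures from the first population observed")
    return zeros / ones


if __name__ == "__main__":
    # Hypothetical record: 1 = failure from population 1, 0 = from population 2.
    print(estimate_gamma([1, 0, 0, 1, 0, 0]))
```

A larger share of failures from the first population gives a smaller estimate of γ, in line with a shorter mean life θ_1 relative to θ_2.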


15. Existence of Invariant Minimax Procedures. Herman Rubin, Stanford
University.


An unpublished result of Hunt and Stein states that, under certain specified conditions, 
if there is a most stringent test, then there is a most stringent invariant test. This theorem 
has been translated into a theorem about decision functions and, by a method similar to 
theirs, it is shown that, under certain restrictions, given any statistical procedure, there is 
an invariant procedure whose “maximum” risk is no greater than the “maximum” risk of
the given procedure. Thus, under those conditions, one may restrict a search for minimax 
procedures to invariant procedures. A similar though somewhat less general result has 
been obtained by Peisakoff. 


16. A Simple Sequential Procedure for Testing Statistical Hypotheses.
(Preliminary Report.) Chia Kuei Tsao, Wayne University.


In applying Wald’s sequential probability ratio test, one has to choose a definite alterna-
tive hypothesis which may restrict its application. In this paper, a different simple sequen-
tial test procedure is suggested. In testing a simple hypothesis, say f(x) = f_0(x), one needs
only to divide the whole space into three mutually exclusive and exhaustive zones: the zone
of acceptance R_1, the zone of indifference R_2 and the zone of rejection R_3. Random ob-
servations are drawn successively. At each stage, the number of observations falling in
each of these three zones will be counted. Denote by n_i the number of observations falling
in the ith zone (i = 1, 2, 3) after the nth observation has been drawn and let Σn_i = n.
Continue to draw observations as long as n_1 < k and n_3 < k (where k is a positive integer).
The experiment is discontinued as soon as max(n_1, n_3) = k. The null hypothesis is accepted
if n_1 = k or rejected if n_3 = k. Distribution of the sample size, the m. g. f. and the best critical
region are obtained. It is also shown that these tests are consistent. Uniformly most power- 
ful and unbiased tests are discussed. For properly chosen zones, they are, on the average, 
more efficient than some of the standard fixed sample tests. Furthermore, these tests can 
also be applied to the nonparametric case. (This work was supported in part by the Office 
of Naval Research.) 
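
The stopping rule described above can be sketched in a few lines. This is an illustration under assumptions: the zone labels 1, 2, 3 stand for the zones of acceptance, indifference and rejection, and the `zone_of` function that assigns an observation to a zone is a hypothetical argument supplied by the user, since the abstract does not say how the zones are chosen.

```python
def three_zone_test(observations, zone_of, k):
    """Count observations per zone and stop as soon as the acceptance
    zone (1) or the rejection zone (3) has received k observations.

    Returns (decision, sample_size); the decision is "undecided" if the
    supplied observations run out before either count reaches k.
    """
    counts = {1: 0, 2: 0, 3: 0}
    n = 0
    for x in observations:
        n += 1
        counts[zone_of(x)] += 1
        if counts[1] == k:
            return "accept", n
        if counts[3] == k:
            return "reject", n
    return "undecided", n


if __name__ == "__main__":
    # Hypothetical zones on the real line: below -1, between -1 and 1, above 1.
    def zone(x):
        return 1 if x < -1 else (3 if x > 1 else 2)

    print(three_zone_test([0, -2, 2, -3, 0.5, -1.5], zone, k=3))
```

Observations in the zone of indifference lengthen the experiment without moving it toward either decision, which is why the sample size is random.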


17. Asymptotic Properties of the Robbins-Monro Process. Joseph L. Hodges,
Jr. and Erich L. Lehmann, University of California, Berkeley.


Robbins and Monro have proposed a class of stochastic processes which converge (in 
the sense of mean square) to the root of a regression equation. The rate of convergence of 
the mean square errors of these processes is shown to depend only on the local properties 
of the model at the root, and thus the asymptotic optimum problem may be solved by 
considering the easy problem of linear regression with constant variance. Asymptotically, 
the best Robbins-Monro scheme cannot be improved. As an application, an asymptotically 
optimum solution of the bio-assay problem is provided for any percentile. 
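
For readers unfamiliar with the process, a minimal sketch of a Robbins-Monro iteration follows. It assumes the common step sequence c/n and a hypothetical linear regression function chosen for illustration; neither the gain constant nor the model is taken from the abstract.

```python
import random


def robbins_monro(observe, alpha, x1, c, steps):
    """Iterate x_{n+1} = x_n - (c/n)(y_n - alpha), where y_n = observe(x_n)
    is a (possibly noisy) observation of the regression function at x_n.
    The iterates are meant to approach the root of M(x) = alpha."""
    x = x1
    for n in range(1, steps + 1):
        y = observe(x)
        x = x - (c / n) * (y - alpha)
    return x


if __name__ == "__main__":
    rng = random.Random(0)

    # Hypothetical model: M(x) = 2x + 1 observed with unit-variance normal noise.
    def noisy(x):
        return 2.0 * x + 1.0 + rng.gauss(0.0, 1.0)

    # The root of M(x) = 5 is x = 2; the estimate should land near it.
    print(robbins_monro(noisy, alpha=5.0, x1=0.0, c=0.5, steps=10000))
```

In this linear example the choice c = 0.5 is the reciprocal of the slope, so without noise the iteration reaches the root after one step; with noise, the local slope at the root is what governs how fast the mean square error decreases, which is the point the abstract makes.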


18. Some Notes on the Application of Sequential Methods in the Analysis of
Variance. N. L. Johnson, University of North Carolina.


This paper is concerned with points of detail arising in the application of sequential 
methods to problems which, if fixed sample sizes were used, would be suitable for analysis 
of variance techniques. Using results due to G. A. Barnard and D. R. Cox it is shown that, 
in the case of systematic models, a valid sequential test (not dependent on arbitrary weight-
ing functions) may be based on the sequence of deviance ratios calculated at various stages
as the experimental design is built up. Similar procedures, in the case of random models, 
do not always terminate with probability one. Alternative forms of design, such that this 
difficulty will not arise, are suggested and a short table of critical limits for deviance ratios 
is provided. (This research was supported by the United States Air Force under Contract
AF18(600)-83.)


19. The Covariances of Frequencies from a Multinomial Distribution under a
Sequential Sampling Rule. M. C. K. Tweedie, introduced by B. Harshbarger,
Virginia Polytechnic Institute.


Suppose that a constant multinomial distribution is sampled repeatedly, with replace-
ment, only until some nondecreasing linear function Z of the frequencies reaches some
prescribed positive value ζ. That is, if x_i is the observed frequency in group i (i = 1 to N),
and Z = Σ_{i=1}^{N} (λ_i x_i), where each λ_i is a nonnegative real constant, sampling is stopped
as soon as Z ≥ ζ. If π_i is the probability that any one observation will fall in group i, the
expectation of the final sample size n is E(n) ~ ζ/Σ(λ_i π_i), and E(x_i) = π_i E(n) (cf. Black-
well). Wald’s fundamental identity can be generalized to E{e^{−U}(Σ π_i e^{−t_i})^{−n}} = 1, where
U = Σ(x_i t_i), and (assuming that the final Z exceeds ζ by a negligible amount) it can be
shown to follow that cov(x_i, x_j) = [ζπ_i/Σ(λπ)]{[π_j Σ(λ²π)/(Σλπ)²] − π_j(λ_i + λ_j)/Σ(λπ) +
δ_ij}, where δ_ij = 0 if i ≠ j, δ_ii = 1. By changing the values of the constants (or scores)
λ_1, λ_2, ..., λ_N, the statistical properties of estimators and other statistics based on
samples thus obtained can be varied, and can sometimes be made to approximate to var-
ious desiderata.


20. The Totality of Transformations Leaving a Family of Normal Distributions
Invariant. (Preliminary Report.) Erich L. Lehmann and Charles M.
Stein, University of California, Berkeley and University of Chicago.


If X is a finite dimensional real linear space, X* its adjoint, Θ ⊂ X*, and μ a σ-finite
measure in X, a set of densities ψ(θ) exp(θx) for θ ∈ Θ is called an exponential family. If f is
a 1-1 function on X onto another linear space X′ taking the exponential family into
another, (Θ′, μ′), then the induced function f: Θ → Θ′ can be extended uniquely to a 1-1
affine function g on the affine space T generated by Θ onto T′. If L is the linear space parallel
to T, then the projection of f on L* is almost linear, in fact it coincides almost everywhere
with the inverse of the adjoint of the linear function on L determined by g. The invariants
of g are obtained. These considerations are specialized to normal distributions, but the re-
sulting algebraic problem has been solved only for special cases. There are f’s which cannot
be generated by an affine transformation of the normal random vector.


21. A Nonparametric Two-Sample Life Test. Benjamin Epstein, Wayne
University.


Let two samples S_1 and S_2 each of size n be put on a life test. For any fixed m define
the random variable Z_m, where Z_m is the number of trials when at least m failures have
occurred in both S_1 and S_2 for the first time. Then the rule Z_m > C_m, a sufficiently large
integer ≥ 2m, gives a procedure for rejecting the null hypothesis that S_1 and S_2 are drawn
from the same population. The power, expected number of observations to make a decision,
and other pertinent features of this test are being investigated experimentally for the
case of normal slippage for n = 2(1)20 and m = 1, 2, 3. The work for m = 1 is related to
recent slippage tests by Mosteller and Tukey. (Work sponsored by the Office of Naval
Research.)


22. The Large Sample Power of a Test Based on Dichotomization. Chia Kuei
Tsao, Wayne University.


Let x be a population having p. d. f. f(x) over a region R. Let R be divided into two
mutually exclusive and exhaustive zones R_1 and R_2. Suppose a random sample of size n is
drawn from x. Let S be the number of observations of the sample falling in the zone R_1.
Then the distribution of the random variable S is given by the binomial distribution:


$$b(s) = \binom{n}{s} p^{s} (1 - p)^{n - s}, \quad \text{where } p = \int_{R_1} f(x)\,dx.$$


By the use of this binomial distribution, one can
test the hypothesis f(x) = f_0(x) against an alternative f(x) = f_1(x). In order to obtain the
maximum power for fixed n, one needs to determine an optimum zone R_1. In this paper, it
is shown that, for large n, the optimum zone R_1 is one which consists of all those points x
where f_1(x)/f_0(x) ≥ 1. (Supported in part by the Office of Naval Research.)
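
As an illustration of the optimum zone, consider a worked case that is not in the abstract and is assumed here only for concreteness: f_0 is the standard normal density and f_1 is the normal density with mean mu1 > 0 and unit variance. The likelihood ratio f_1(x)/f_0(x) = exp(mu1·x − mu1²/2) is at least 1 exactly when x ≥ mu1/2, so the zone is a half-line. The sketch computes the binomial parameter p under each hypothesis.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def optimum_zone_probabilities(mu1):
    """For f_0 = N(0, 1) and f_1 = N(mu1, 1) with mu1 > 0, the zone where
    f_1/f_0 >= 1 is {x >= mu1/2}. Returns the threshold and the probability
    of that zone under f_0 and under f_1."""
    threshold = mu1 / 2.0
    p0 = 1.0 - normal_cdf(threshold)
    p1 = 1.0 - normal_cdf(threshold - mu1)
    return threshold, p0, p1


def binomial_pmf(s, n, p):
    """b(s) = C(n, s) p^s (1 - p)^(n - s)."""
    return math.comb(n, s) * p ** s * (1.0 - p) ** (n - s)


if __name__ == "__main__":
    threshold, p0, p1 = optimum_zone_probabilities(1.0)
    print(threshold, p0, p1)
    # Chance that all 5 of 5 observations fall in the zone, under each hypothesis.
    print(binomial_pmf(5, 5, p0), binomial_pmf(5, 5, p1))
```

The test then reduces to a comparison of two binomial parameters, p0 against p1, based on the count S.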


23. An ε-Complete Class of Nonsequential Decision Procedures in the Finite
Case. Lionel Weiss, University of Virginia.


Suppose there is a finite set of possible decisions: d_1, d_2, ..., d_k, and a finite number of
possible distributions for the vector chance variable X, with density functions f(x:1),
f(x:2), ..., f(x:n). Then, given any positive ε, by a simple use of the Dantzig-Wald con-
verse of the Neyman-Pearson lemma it can be shown that the following class of decision
procedures is ε-complete: Either d_i or d_j is rejected according to whether or not


f(x:s[i, j]) > Σ a(r:i, j) f(x:r),


where the summation is with respect to r and extends over all r not equal to s[i, j]. This
sort of comparison is made k − 1 times, until only one decision is left, which is the
decision finally chosen. The a(r:i, j) are predetermined constants which will in general
depend on which decisions have already been rejected by the preceding comparisons.
The s[i, j] are also predetermined. In the case where an invariance condition of the following
form is imposed: a set of points T(X) containing X is associated with each X and the same
decision must be made for any point in the set T(X), then if we replace f(x:r) by ∫f(t:r) dt,
where the integration extends over T(X), the decision procedures defined in the same way
as above in terms of these new functions form an ε-complete class among all procedures with
the invariance property.


24. Approximate Distribution of Extreme Values of the Range. M. H. Belz
and Robert Hooke, University of Melbourne and Princeton University.


For a sample range that is considered extreme it might be expected that the order statis- 
tics that define it would be approximately independent of each other. Further, a consider- 
ation of the cumulative distribution function for the range in samples from a normal popu- 
lation shows that, for extreme values, the distribution is closely normal. These features 
suggest: (i) the investigation of the distribution of the range under the assumption of inde- 
pendence, and a comparison of the probability obtained from this distribution with that 
obtained from the exact distribution when the range is extreme; (ii) the development of a 
practical procedure for approximating to critical probabilities associated with the range, 
based on the assumption that the order statistics in question are normally distributed
around their expected values with characteristic variances. Relations have been obtained
between probabilities computed in the above ways for a variety of distribution functions. 
Some interesting limiting results have also been examined for the ratio of the probability 
that the range will exceed a given value to that found under the assumption of independence, 
merely, as the given value increases. As a practical measure, a good approximation to the 
exact probability that the range will exceed an extreme value appears to be obtained by 
adopting the normal procedure mentioned, with the required low moments estimated from 
the data of a number of independent samples, provided that an appropriate co-factor, 
depending on the sample size, is attached to the probability so found. The method can be 
readily extended to the more general case where a comparison between two linear functions 
of the order statistics is made the basis of a statistical test or other procedure. (Work 
done under the sponsorship of the U. S. Office of Naval Research.) 


25. Estimate of the Interval Rate for Actuarial Calculation. (Preliminary Report.) 
JOSEPH BERKSON, Mayo Clinic. 


If the functional form of the survivorship curve is known, a 7-year survival rate can be 
estimated by ‘‘fitting’’ the function to the observations, using some method such as maxi- 
mum likelihood. If the form is not known, the actuarial method is used. This consists of 
subdividing the 7-year interval into subintervals, estimating the probability of survival 
in each interval, and obtaining the 7-year rate as the product of these. The case is con- 
sidered in which the individuals of the population concerned are identified, and informa- 
tion concerning death or survival is obtained by periodic survey. It is assumed that in any 
subinterval the survivorship may be considered linear. Four estimators of q = 1 − p, the
probability of dying in the interval, are considered and compared: (1) D/[N − W/2],
(2) D/[N − L/2], (3) the solution of q = D/[N − L/(2 − q)] and (4) the solution of
q = D/[N − Σ{L_i(1 − t_i)/(1 − qt_i)}] where N is the number observed living at the begin-
ning of the interval, D is the number of observed deaths during the interval, L_i is the num-
ber last observed living during the interval at time t_i after the beginning of the interval,
L = ΣL_i and W is the total number “withdrawn” during the interval.
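
Estimators (3) and (4) are defined only implicitly, so they have to be solved numerically. The sketch below uses bisection on the interval [0, 1], which is a choice made here and not a method stated in the abstract. It also assumes that each t_i is expressed as a fraction of the interval with 0 ≤ t_i < 1; with every t_i equal to 1/2, equation (4) reduces to equation (3). All function names are hypothetical.

```python
def estimator_withdrawn(N, D, W):
    """Estimator (1): D / (N - W/2)."""
    return D / (N - W / 2.0)


def estimator_last_seen(N, D, L):
    """Estimator (2): D / (N - L/2)."""
    return D / (N - L / 2.0)


def solve_implicit(N, D, last_seen, iterations=200):
    """Solve q = D / (N - sum of L_i (1 - t_i) / (1 - q t_i)) by bisection.

    last_seen is a list of (L_i, t_i) pairs with 0 <= t_i < 1.
    Estimator (3) is the special case last_seen = [(L, 0.5)].
    """
    def h(q):
        lost = sum(L_i * (1.0 - t_i) / (1.0 - q * t_i) for L_i, t_i in last_seen)
        return q * (N - lost) - D

    lo, hi = 0.0, 1.0  # h(0) = -D <= 0; h(1) >= 0 when N >= D + total L_i
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if h(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    # Hypothetical counts: 100 alive at the start, 10 deaths, 20 last seen at mid-interval.
    print(estimator_last_seen(100, 10, 20))
    print(solve_implicit(100, 10, [(20, 0.5)]))
```

When no individuals are lost from observation, every estimator reduces to D/N.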


26. Simultaneous Confidence Interval Estimation. R. C. Bose and S. N. Roy,
University of North Carolina.


Let y_i (i = 1, 2, ..., N) be an observed set of random variables whose joint distribution
depends on the unknown parameters θ_j (j = 1, 2, ..., n). Let M_k = f_k(θ_1, θ_2, ..., θ_n),
k = 1, 2, ..., be a given finite or infinite set of parametric functions. Suppose it is possible
to find a set of functions ψ_k(y_1, y_2, ..., y_N, M_k) such that ψ_k ≤ c implies φ_{1k}(y_1, y_2, ..., y_N)
≤ M_k ≤ φ_{2k}(y_1, y_2, ..., y_N). Let W be the intersection of the regions ψ_k ≤ c for k = 1, 2, ....
If Prob{(y_1, y_2, ..., y_N) ∈ W} = 1 − α and is independent of the parameters, then 1 − α is
also the chance that all the statements φ_{1k} ≤ M_k ≤ φ_{2k} are simultaneously true. Some results
due to Scheffé and Tukey are derived and other examples of simultaneous confidence inter- 
val estimation (especially in connection with factorial experiments) are given. Applications 
to multivariate analysis will be given separately. 


27. On a Set of Simultaneous Confidence Interval Statements in Multivariate
Analysis of Variance. S. N. Roy and R. C. Bose, University of North
Carolina.


In an earlier paper (S. N. Roy, “On a heuristic method of test construction and its use
in multivariate analysis,” to appear in Ann. Math. Stat.), among other things, the following
test criterion was proposed: Suppose there are k p-variate normal populations with a
common covariance matrix Σ and p-dimensional mean column vectors ξ_i (i = 1, ..., k),
and random samples of sizes n_i + 1 drawn from them, and let S* and S stand for the “be-
tween” and “within” covariance matrices of the k samples and n = Σ_{i=1}^{k} n_i. Then for the
hypothesis H(ξ_1 = ... = ξ_k) = H_0, the critical region (of size α) proposed was:
θ_q ≥ θ(α, p, k − 1, n), where P[θ_q ≥ θ(α, p, k − 1, n) | H_0] = α, q = min(p, k − 1) and
θ_q is the largest root of the determinantal equation: |S* − θS| = 0. In this paper, denoting
by x_i the p-dimensional mean column vector for the ith sample and inverting, in the usual
manner, the process of construction of the above type of test, the following set of simul-
taneous confidence intervals (with a joint confidence coefficient 1 − α) is obtained:
Σ_{i=1}^{k} λ_i(n_i + 1)^{1/2} u′(x_i − ξ_i) ≤ [kθ(α, p, k, n)u′Su]^{1/2} for all nonsingular column vectors
u and all sets of λ_i (i = 1, 2, ..., k) subject to: Σ_{i=1}^{k} λ_i² = 1. On the left-hand side of the
confidence interval statement the absolute value of a scalar is taken and on the right-hand
side the positive square root of a positive quantity.


28. On a Relevant Set of Simultaneous Confidence Interval Statements in
Discriminant Analysis. S. N. Roy and R. C. Bose, University of North
Carolina.

In the setup of the previous abstract suppose that we are interested in a set of simul-
taneous confidence interval statements on u′(ξ_i − ξ_j) (for all nonsingular p-dimensional
column vectors u and all i ≠ j = 1, 2, ..., k). For a given pair (i, j) it is noted that


$$\max_{u} \frac{(n_i + 1)(n_j + 1)}{n_i + n_j + 2}\, \frac{u'[x_i - x_j - (\xi_i - \xi_j)][x_i - x_j - (\xi_i - \xi_j)]'u}{u'Su} = \frac{(n_i + 1)(n_j + 1)}{n_i + n_j + 2}\, (x_i - x_j - \xi_i + \xi_j)' S^{-1} (x_i - x_j - \xi_i + \xi_j) = T_{ij};$$


and that this is distributed as Hotelling’s T with D.F. p and n + 1 − p, where n = Σ_{i=1}^{k} n_i.
This is used to obtain finally the following set of simultaneous confidence intervals (for
all u and i ≠ j = 1, 2, ..., k) with a joint confidence coefficient 1 − α: u′(x_i − x_j − ξ_i + ξ_j)
≤ [θu′Su]^{1/2} where θ is given by the simultaneous probability:


P[T_ij (i ≠ j = 1, ..., k) ≤ θ] = 1 − α.


While each T_ij, separately, is distributed as a Hotelling’s T with D.F. p and n + 1 − p,
the T_ij’s are not distributed independently and, before we could use the above, the distri-
bution of max_{ij} T_ij has to be worked out. This is unlike the previous case where, on the
null hypothesis, the distribution of θ_q is known and for which the construction of the neces-
sary tables is now under way. It may be noted that the bunch of confidence intervals offered
here is a subset of what is offered in the previous abstract.


29. The Separation of Product Error and Inspection Error in Sampling by
Attributes. C. C. Craig, University of Michigan.


Though the separation of instrumental variability from that of the product being meas- 
ured in case the measurements are values of a statistical variable of the continuous type 
has been dealt with, the analogous problem in case the inspection results consist only of 
the numbers of articles found acceptable in samples appears to have been neglected. The
present paper gives the results of a straightforward approach to the problem including
among the cases considered the realistic one in which the inspector rejects good items and
accepts bad ones with unequal probabilities for the two kinds of error. Moment estimates
which are also maximum likelihood estimates for these probabilities and for the process
average fraction defective together with their large sample variances are found. However,
this appears to be an instance in which very large samples are required to give the asymp-
totic results practical value.


30. The Relation Between Fisher’s Discriminant Function and Wald’s Classifi- 
cation Statistic. H. Leon Harter, Wright-Patterson Air Force Base. 


A problem frequently encountered in statistics is that of classifying an individual into
one of two populations, Π_1 or Π_2. Ordinarily, complete information about the populations
is not available, but scores on p tests are known for samples of N_1 individuals from Π_1
and N_2 individuals from Π_2, as well as for the individual under consideration, a member
of population Π. It is assumed known that Π is identical with either Π_1 or Π_2. It is required
to test the hypothesis H_1: Π = Π_1 against the single alternative H_2: Π = Π_2. Fisher pro-
posed a discriminant function X and Wald a classification statistic V for use in solving prob-
lems of this sort. In this paper it is shown that the relation between X and V takes the
form V = αX − β, where α depends on the sample sizes N_1 and N_2 and β on the hypothesis
under consideration.


31. On the Distribution of the Ratio of the ith Observation in an Ordered
Sample from a Normal Population to an Independent Estimate of the
Standard Deviation. K. C. S. Pillai and K. V. Ramachandran,
University of North Carolina.


Let x_1, ..., x_n be a sample of observations taken from a normal population arranged in
the ascending order of magnitude and s an independent estimate of the standard deviation
with v degrees of freedom. The distribution of x_i/s is obtained as a series whose terms are
Beta functions. The series is observed to be useful in small samples and with the help of
the tables of incomplete Beta functions, the 5 per cent and 1 per cent levels of significance
for x_i/s are being computed for small values of n and values of v up to 20.


32. Cyclic Solutions of Symmetrical Group Divisible Designs. S. S. Shrikhande,
University of Kansas.


An incomplete block design with v treatments each replicated r times in b blocks of size k
is said to be group divisible (Bose and Connor, Ann. Math. Stat., Vol. 23 (1952), pp. 367-
383) if the treatments can be divided into m groups of n each so that any two treatments
belonging to the same group occur together in λ_1 blocks, while any two treatments from
different groups occur together in λ_2 blocks. The design is connected if λ_2 > 0. A set
(d_1, d_2, ..., d_k) of integers is called a cyclic difference set for the symmetrical group
divisible design (r = k and hence b = v) if block i is given by the set


(d_1 + i − 1, d_2 + i − 1, ..., d_k + i − 1)


reduced mod v (i = 1, 2, ..., v). Two theorems on the impossibility of cyclic solutions are
proved. Theorem 1. Let d be an odd positive factor of m, and f be a prime factor of the
square-free part of r² − λ_2 v. Then there is no cyclic solution of the design if (i) d ≡ 3 (mod 4)
and (−d/f) = −1 or (ii) d ≡ 1 (mod 4) and (d/f) = −1. Theorem 2. Let c be an odd positive
factor of n when (c, m) = 1 and let g be a prime factor of the square-free part of r − λ_1.
Then there is no cyclic solution of the design if (i) c ≡ 3 (mod 4) and (−c/g) = −1 or (ii)
c ≡ 1 (mod 4) and (c/g) = −1. These results are obtained by using a result due to Hall and
Ryser (Canadian Jour. Math., Vol. 3 (1951), pp. 495-502).


33. Measures of Association for Cross-Classifications. L. A. Goodman and
W. H. Kruskal, University of Chicago.


If a population is cross-classified by two classifications, one often desires a single number 
which describes the degree of association between the two classifications. Given such a 
measure of association based upon the population proportions, one may wish to estimate 
it or make tests about it on the basis of a sample drawn from the population in a specified 
way. Standard measures of association are described and criticized. A number of other 
measures are suggested and motivated in the frameworks of models for predictive behavior 
which seem typical of the uses to which cross-classifications are put. For example, one 
measure is based on the relative improvement in the prediction of one classification as the 
other is or is not known. Also discussed are measures of partial and multiple association if 
there are more than two classifications. The asymptotic sampling theory for certain meas- 
ures and methods of sampling is discussed. 
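
The abstract gives no formula, but the measure based on the relative improvement in prediction can be illustrated by the index usually written λ_b in the authors' later published work; treating that index as the one meant here is an assumption. The sketch computes it from a table of counts, predicting the column classification with and without knowledge of the row classification. The function name is hypothetical.

```python
def lambda_b(table):
    """Proportional reduction in prediction errors for the column
    classification when the row classification is known.

    table[a][b] is the count in row class a and column class b.
    Without the row, the best guess is the most frequent column overall;
    with the row, it is the most frequent column within that row.
    """
    n = sum(sum(row) for row in table)
    best_without_row = max(sum(col) for col in zip(*table))
    best_with_row = sum(max(row) for row in table)
    if n == best_without_row:
        raise ValueError("all observations fall in one column class")
    return (best_with_row - best_without_row) / (n - best_without_row)


if __name__ == "__main__":
    # Hypothetical 2 x 2 table of counts.
    print(lambda_b([[30, 10], [10, 50]]))
```

The index is 0 when knowing the row does not change the best guess for the column and 1 when the row determines the column exactly.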


34. Calculating Longevities from Sample Composition. Leo A. Goodman,
University of Chicago.


Sometimes it is desired to compare the longevity of two or more types of equipment 
under operational conditions where it is not convenient to identify or keep records of 
individual items. Such a comparison can be made by adopting certain replacement rules 
and observing their effect on the composition of the population. For example, when only 
two types are being compared, the replacement rule might be that when an item fails, its 
replacement will be of the opposite type. Then the composition of the population at any 
time (i.e., the proportions of the different types among all the items in use) will depend 
upon the original composition of the population, the time elapsed, and the longevities 
of the different types. Since the original composition and the elapsed time are known, by 
determining the new composition of the population (either by total inspection or by draw- 
ing a sample from the new composition) we would expect to obtain information concerning 
the longevities of the different types of equipment. Replacement policies are studied which 
satisfy certain logistics requirements as well as the requirement that a given number of 
items be in operation at all times. For certain given logistics requirements, optimum re-
placement rules are developed. The problem of estimating and testing hypotheses concern-
ing the relative longevities of K ≥ 2 types of equipment is studied for the case where the
equipment is subject to a constant risk. It is then shown that if the replacement rules are 
used for a long period of time, the results obtained under the assumption of constant risk 
remain valid even when the risk is not constant as long as the equipment has a finite life 
span. When information about the stock is also available (i.e., how many items have been 
replaced), estimates and tests of hypotheses concerning the longevities of the K types of
equipment are obtained. A numerical illustration is given. 


NEWS AND NOTICES 


Readers are invited to submit to the Secretary of the Institute news items of interest.


Personal Items 


Dr. M. H. Belz has now returned to the University of Melbourne after spend-
ing the first semester at Princeton. During the period of his sabbatical leave he
attended the Conferences of the International Statistical Institute and the
Biometric Society in India, in December, 1951, and later spent some months in 
England, mainly in the Universities of Cambridge and London. In April, 1952, 
he made an extensive investigation of industrial applications of statistics in 
Britain, in the company of Professor W. Allen Wallis. In August, 1952, he was 
a participant in the annual International Harvester Program for University 
Personnel in Chicago. 

Mr. Willard H. Clatworthy has joined the staff of the Statistical Engineering 
Laboratory, National Bureau of Standards, Washington, D. C. 

Mr. George G. den Broeder, Jr. has accepted a position as Mathematical 
Statistician with the National Bureau of Standards at Corona, California with 
the Missile Evaluation Section of the Guided Missile Division. 

Mr. John B. Ellery has been appointed to the faculty of the Department of 
Speech and Dramatic Arts at the State University of Iowa to assist Professor 
E. C. Mabie in Experimental Aesthetics in the Theatre Program. 

Dr. H. A. Freeman of the Massachusetts Institute of Technology is spending 
the year 1952-53 at Princeton University. 

Mr. Harry H. Goode, formerly Supervisor of the Aerophysics Group and 
Chief Project Engineer, is now Director of the Willow Run Research Center, 
University of Michigan. 

Dr. Robert E. Greenwood has been released of his active duties as Lieutenant 
Commander in the United States Naval Reserve and has rejoined the faculty 
of the University of Texas as Associate Professor of Applied Mathematics. 

Dr. E. J. Gumbel has been appointed Visiting Professor for Mathematical 
Statistics by the Free (West) University, Berlin for the summer term of 1953. 

Dr. Carl Hammer, formerly Research Associate at the Bureau of Applied 
Social Research, Columbia University, has accepted the position of Senior 
Research Engineer at the Franklin Institute of the State of Pennsylvania. 

Mr. Robert G. Hoffman, formerly of the Department of Biostatistics at the 
School of Public Health, University of North Carolina, is now at the School of 
Public Health, University of Michigan. 

Mr. John F. Hofmann has transferred from Dugway Proving Grounds, 
Chemical Corps installation in Utah to the National Bureau of Standards 
Laboratory at Corona, California, where he is working in the Missile Evaluation 
Section of the Missile Development Division. 

Dr. Leo Katz, who is on sabbatical leave from Michigan State College, is at 
the Statistical Laboratory, University of California, Berkeley, California. 

Mr. Marvin M. Lavin, formerly of the Institute for Air Weapons Research, 
the University of Chicago, and of the Weapons Systems Evaluation Group, 
Office of the Secretary of Defense, is now a member of the Aircraft Division, 
the Rand Corporation, Santa Monica, California. 

Mr. Harry D. Levine has accepted a position as Quality Control Engineer 
with Automatic Musical Instruments, Inc., Grand Rapids, Michigan. 

Mr. Edward A. Lew, Associate Actuary of the Metropolitan Life Insurance
Company, will assume the position of Associate Actuary and Statistician upon
the retirement of Dr. Louis I. Dublin of the Metropolitan Life Insurance Com-
pany.

Dr. Joe J. Livers has resigned as Professor of Mathematics at Montana State 
College, Bozeman, and has accepted a position as Statistician in the Engineer- 
ing Department of the Boeing Airplane Company at Seattle, Washington. 

Mr. Duncan C. McCune has left Purdue University where he was Research 
Associate in the Statistical Laboratory and is now a Quality Control Engineer 
at Tubular Products Division of the Babcock and Wilcox Company, Beaver 
Falls, Pennsylvania. 

Mr. Paul Meier has left the Forrestal Research Center, Princeton, New Jersey 
to accept a position as Research Associate in the Department of Biostatistics, 
School of Hygiene and Public Health, Johns Hopkins University. 

Mr. William B. Michael, formerly of the Rand Corporation, is now with the 
Test Bureau of the University of Southern California at Los Angeles. 

Mr. Raphael Miller is now a teaching assistant with the Department of Math- 
ematics, Carnegie Institute of Technology, and is completing the requirements 
for his Ph.D. degree. 

Mr. O. B. Moan, formerly with Julius Hyman and Company in Denver, 
Colorado, is now working with International Business Machines in Chicago. 

Dr. Sigeiti Moriguti, who has spent the past two years at the Department of
Mathematical Statistics, University of North Carolina on a leave of absence, 
has returned to his former position of Assistant Professor at the Department of 
Applied Mathematics, Faculty of Engineering, University of Tokyo. 

Mr. A. Carl Nelson, Jr. who has been an instructor in Mathematics at 
Marshalltown, Delaware during the past academic year, is now working for 
a Ph.D. degree in Mathematical Statistics at the University of North Carolina. 

Dr. Bernard Ostle, formerly of the Statistical Laboratory at Iowa State 
College, has accepted a position as Associate Professor of Mathematics at 
Montana State College. 

Mr. Frank Proschan, formerly in the Statistical Engineering Laboratory of 
the National Bureau of Standards, is now supervisor of Quality Control in the 
Atomic Energy Division of Sylvania Electric Products, Inc. 

Mr. I. M. Sahni, who received his M. Com. degree at Leeds University, 
England, is a Research Officer in the Office of the Economic Adviser to the 
Government of India. 

Mr. Franklin E. Satterthwaite has joined the Operations Research Group of 
Arthur D. Little, Inc. 

Dr. Herbert Solomon has left the Office of Naval Research to assume the 
position of Associate Professor at Teachers’ College, Columbia University, 
New York City. 

Mr. Mortimer Spiegelman, formerly Assistant Statistician at the Metropolitan 
Life Insurance Company, was appointed Associate Statistician on the retire- 
ment of Dr. Louis I. Dublin, January 1, 1953. 







Dr. L. L. Thurstone, formerly at the University of Chicago, is now Research 
Professor and Director of New Psychometric Laboratory at the University of 
North Carolina, Chapel Hill. 

Dr. Howard G. Tucker has been appointed to an instructorship in the Depart- 
ment of Mathematics of Rutgers University. 

Dr. W. R. VanVoorhis of Fenn College has been appointed Visiting Professor 
of Engineering Administration at Case Institute of Technology and is also a 
member of the Case Operations Research Group. 

Dr. Mason E. Wescott, formerly Assistant Professor in the Mathematics 
Department at Northwestern University, Evanston, is now Professor of Applied 
Statistics in the Mathematics Department of the University College at Rutgers 
University, New Brunswick. Dr. Wescott is on leave of absence until February 
1, 1953 to serve on a team sent to India by the Technical Assistance Administration 
of the United Nations at the request of the Indian Government. 

Dr. Ruric E. Wheeler has accepted a position as Assistant Professor in the 
Department of Mathematics at the Florida State University, Tallahassee. 

Professor Jacob Wolfowitz of Cornell University is on leave of absence and 
will be at the Institute for Numerical Analysis, Los Angeles, California, until 
January 31 and at the University of Illinois until June 5. 

Dr. Max A. Woodbury has accepted a position as Associate Professor in the 
Department of Statistics at the University of Pennsylvania. 

Mr. R. K. Zeigler, formerly employed by the Atomic Energy Commission, 
Oak Ridge, Tennessee, has accepted a position as Statistician, Los Alamos 
Scientific Laboratory, Los Alamos, New Mexico. 


Mr. Marvin Zelen has resigned his position with the Experimental Towing 
Tank, Stevens Institute of Technology, and accepted a position as Mathematical 
Statistician at the Statistical Engineering Laboratory, National Bureau of 
Standards. 




The Department of Statistics, University of Melbourne 


This Department, created by subdivision from the Department of Mathe- 
matics in 1948, has three main objectives: (i) to provide a wide range of training 
courses to undergraduates in various Faculties as well as extra-curricular courses 
to research workers, (ii) to conduct research in Statistics, and (iii) to advise and 
assist other University Departments in the laying out of experiments and their 
analysis. 

Elementary courses are offered regularly to students in Science, Engineering 
and Forestry while short courses have been given to specialist students in 
Optometry, Bacteriology and Biochemistry. A post-graduate non-credit course 
on “Statistical Methods for Research Workers,” extending throughout the 
academic year, has been a teaching feature of long standing and, in 1953, a 
second post-graduate course on “Advanced Techniques” will be introduced. 
The mathematical courses in Statistics are taken in the Faculty of Arts through 
the subjects “Theory of Statistics, Parts I and II,” and “Advanced Statistics.” 
Students working towards a pass degree take these for two years while honour 
students take them over three years, in each case after one year’s experience 
with Calculus and with continuing mathematical studies throughout their 
course. Practical work, both in the laboratory and in the field, is required of all 
students, involving computations and the collection and analysis of data relative 
to the student’s special field of interest. The first degree of Bachelor of Arts 
with Honours in the School of Mathematical Statistics is obtained after four 
years of undergraduate work consisting mainly of Pure Mathematics and Statis- 
tics. The higher degrees of Master of Arts and Doctor of Philosophy may be 
obtained by carrying out research, essentially within the Department, although 
a limited amount of research performed elsewhere may be approved. 

Much of the Department’s activity is concerned with the advisory service 
mentioned. This is offered freely to experimenters in other University Depart- 
ments where statistical features are encountered. It consists in advising on prob- 
lems of design and analysis and, in certain cases, of carrying through the com- 
putations in consultation with the research worker concerned. Almost every 
experimental department within the University has been assisted in some form 
or other during the last few years and the demand for help is steadily increasing. 
Assistance is also given to outside institutions conducting research, including 
governmental departments, and to industrial firms, the latter usually on a 
contractual basis. 

The Department is fortunate in having close contact with many “outside” 
professional statisticians, especially those of the Division of Mathematical 
Statistics, C. S. I. R. O. It provides a focal point for discussions on special 
problems, seminars, and meetings of the local branch of the Biometric Society, 
and in the future it will endeavour to arrange visits and exchange lectureships 
with statisticians overseas. 

— 


Summer Sessions at Berkeley, California 


This year’s summer program at the Statistical Laboratory of the University 
of California, Berkeley, California, consists of two sessions: June 22-August 1 
and August 3-September 12. The faculty of the summer sessions will include 
Professor David Kendall of Magdalen College, Oxford University; Professor J. 
Neyman, Dr. T. A. Jeeves, and Mr. A. Shapiro of the Statistical Laboratory, 
University of California. 


The program includes two of the usual undergraduate courses in each session. 
In addition, Professor Kendall will give a new course in the first session, “Sto- 
chastic processes associated with population growth and with the theory of 
queues.” This course is designed to acquaint students with the probabilistic treat- 
ment of growth of populations subject to birth, death, immigration, mutation 
and aging. Professor Neyman will be available for consultations on work leading 
to higher degrees. 


New Members 

The following persons have been elected to membership in the Institute 
September 1, 1952 to November 30, 1952 


Allen, William R., M.S. (Northwestern Univ.), Mathematician, Advisory Board on Simu- 
lation, The University of Chicago, 5822 S. Blackstone, Chicago 37, Illinois. 

Bastenaire, Francois, Statisticien diplomé (Univ. de Paris), Statistician, Institut de 
Recherches de la Siderurgie, 26 Boulevard Rov—Pavillons-sous-Bois (Seine) France. 

Black, Harold W., M.S. (Univ. of Michigan), Statistician, U. S. Public Health Service, 
Communicable Disease Center, 169 Eighth Street, N.E., Atlanta, Georgia. 

Dawson, Reed B. Jr., A.M. (Harvard Univ.), Graduate Student, Statistics, Harvard, 47 
Inman Street, Cambridge 39, Massachusetts. 

Descloux, Alfred, Mathématicien diplomé, (Ecole Polytechnique Fédérale Switzerland), 
Fellow in Mathematical Statistics, Institute of Statistics, University of North Carolina, 
Chapel Hill, North Carolina. 

Fagot, Robert F., S. B. Meteorology (Mass. Inst. of Tech.), Graduate Student, Depart- 
ment of Psychology, Stanford University, Stanford, California. 

Flatow, Paul, B.S. (Columbia Univ.), Associate, Department of Industrial Engineering, 
c/o Weeks, 456 Riverside Drive, New York 27, New York. 

Gilbert, Edgar J., M.A. (Harvard Univ.), Graduate Student, Harvard University, 201 B 
Holden Green, Cambridge, Massachusetts. 

Gorman, William Moore, B.A. (Trinity College, Dublin), Lecturer in Econometrics and 
Statistics, University of Birmingham, Highfield, 128 Selly Park Road, Birmingham 29, 
England. 

Guenther, William C., Ph.D. (Univ. of Wash.), Mathematician, Bureau of Standards, 
Corona, California. 

Hays, David G., B.A. (Harvard Univ.), Graduate Student, Harvard University, 9 Bow 
Street, Cambridge 38, Massachusetts. 

Hooke, Robert, Ph.D. (Princeton Univ.), Research Associate, Princeton University, 28 
Dorann Avenue, Princeton, New Jersey. 

Jebe, Emil H., Ph.D. (North Carolina State College), Associate Professor of Statistics, 
Iowa State College, Statistical Laboratory, Iowa State College, Ames, Iowa. 

Johnson, Norman Lloyd, Ph.D. (London), Visiting Associate Professor and Research Asso- 
ciate, University of North Carolina, 106, Richmond Road, Ilford, Essex, England, Insti- 
tute of Statistics, University of North Carolina, Chapel Hill, North Carolina (until 
June, 1953). 

Kane, William E., M.A. (Univ. of Calif. at L. A.), Research Engineer, Northrop Aircraft 
Inc., Hawthorne, California, 18011 Martha Street, Encino, California. 

Kulldorff, Gunnar K. O., Fil. Kand. (Univ. of Lund, Sweden), Malmgatan 16, Malmö, 
Sweden, International House, Berkeley 4, California (until January, 1953). 

McLaughlin, George, B.S. (Univ. of Montreal), Graduate Student, University of Toronto, 
60 Sussex Avenue, Toronto, Ontario, Canada. 

Magidson, Jack, B.S. (Mass. Inst. of Tech.), Quality Control Manager, Metallizing Engi- 
neering Company, Inc., 24 Nathan Hale Drive, Huntington, New York. 

Nesbeda, Paul, Ph.D. (Univ. of Pisa, Italy), Engineer, RCA Victor Division, 108-A Wall- 
worth Park, Haddonfield, New Jersey. 

Preston, Glenn W., M.S. (Yale Univ.), Consulting Engineer, Philco Corporation, 1000 
Pleasant Lane, Oreland, Pennsylvania. 

Schaffer, Karl-August, Diplom-Mathematiker (Göttingen), Wirtschaftsmathematiker, 
Bielefeld, Hammerschmidtstr. 11, Germany. 

Schmidt, Howard O., B.S. (Duke Univ.), Metallurgist, Statistical Quality Control, Sprague 
Road at Bryant, Columbia Station, Ohio. 







Silver, Jack, B.A. (George Washington Univ.), Statistician, Department of the Air Force, 
4634 N. 23rd Street, Arlington, Virginia. 

Smuk, Walter, B.A. (Univ. of Toronto), Graduate Student, University of Toronto, 274 
Sixth Street, New Toronto, Toronto 14, Ontario, Canada. 

Sugihara, Seiji, M.A. (Columbia Univ.), Graduate Student, Columbia University, 423 W. 
118th Street, Apartment 53, New York 27, New York. 

Sutermaster, Charles R., M.S. (Carnegie Inst. of Tech.), Graduate Student, Project 
Mathematician, Department of Mathematics, Carnegie Institute of Technology, 
Pittsburgh 15, Pennsylvania. 

Thomasian, Aram John, M.A. (Harvard Univ.), Mathematical Statistician, Arthur D. 
Little, Inc., 63 John Mooney Road, Revere 51, Massachusetts. 




REPORT OF THE CHICAGO MEETING OF THE INSTITUTE 


The fifty-fourth meeting and the fifteenth Annual meeting of the Institute 
of Mathematical Statistics was held in Chicago, Illinois, on December 27-30, 
1952. The meeting was held in conjunction with meetings of the American 
Statistical Association, the Econometric Society, the Biometric Society, and of 
the other members of the Allied Social Sciences Association. One session was 
co-sponsored by the American Statistical Association, and four were co-spon- 
sored by the Econometric Society. A Special Invited Address was given by 
Professor Jacob Wolfowitz on Asymptotic Estimation Theory. The following 222 
members of the Institute attended: 


Helen Abbey, N. S. Acton, Beatrice Aitchison, R. L. Anderson, T. W. Anderson, F. C. 
Andrews, E. E. Ard, H. E. Arnold, K. J. Arnold, L. A. Aroian, G. J. Auner, T. A. Bancroft, 
E. W. Barankin, Walter Bartky, W. D. Baten, Helen Beard, R. E. Bechhofer, Richard 
Berger, Joseph Berkson, A. J. Berman, Allan Birnbaum, Z. W. Birnbaum, David Blackwell, 
C. I. Bliss, Isidore Blumen, R. C. Bose, A. H. Bowker, R. A. Bradley, Dorothy Brady, 
A. E. Brandt, Bernice Brown, K. A. Brownlee, I. W. Burr, L. D. Calvin, S. D. Canter, 
Osmer Carpenter, Herman Chernoff, C. W. Churchman, W. G. Cochran, C. H. Coombs, 
L. M. Court, D. J. Cowden, E. L. Cox, P. C. Cox, C. C. Craig, J. F. Daly, Cuthbert Daniel, 
G. B. Dantzig, D. A. Darling, Besse Day, W. E. Deming, Alfred Descloux, Philip Desind, 
W. J. Dixon, J. L. Dolby, J. L. Doob, A. J. Duncan, D. B. Duncan, C. W. Dunnett, David 
Durand, A. M. Dutton, Meyer Dwass, W. F. Elkin, Lillian Elveback, Benjamin Epstein, 
A. V. Fend, Robert Ferber, David Frazier, John Freund, H. C. Fryer, H. H. Germond, 
C. P. Gershenson, M. A. Girshick, L. A. Goodman, Roe Goodman, Franklin Graybill, 
B. G. Greenberg, S. W. Greenhouse, F. A. Gross, W. C. Guenther, Lee Gunlogson, John 
Gurland, Paul Gutt, R. K. Haddad, J. S. Hagan, K. W. Halbert, K. D. C. Haley, Max 
Halperin, M. H. Hansen, Boyd Harshbarger, H. L. Harter, H. O. Hartley, Mina Haskind, 
P. M. Hauser, W. C. Healy, Jr., F. M. Hemphill, L. H. Herbach, G. R. Herd, J. F. Hofmann, 
P. G. Homeyer, Harold Hotelling, W. G. Howard, C. H. Hubbell, E. R. Immel, P. E. 
Irick, J. E. Jackson, T. J. Jaramillo, T. A. Jeeves, R. J. Jessen, N. L. Johnson, P. O. Johnson, 
H. L. Jones, E. L. Kaplan, A. E. Karp, Harriet Kelly, D. G. Kendall, J. C. Kiefer, E. P. 
King, Jr., Tjalling Koopmans, C. F. Kossack, C. H. Kraft, William Kruskal, Solomon 
Kullback, G. K. O. Kulldorff, T. E. Kurtz, G. M. Kuznets, H. G. Landau, L. M. LeCam, 
Erich Lehmann, G. J. Lieberman, Rensis Likert, R. F. Link, S. B. Littauer, G. F. Lunger, 
P. J. McCarthy, Brockway McMillan, W. G. Madow, E. S. Marks, Jacob Marschak, 
Margaret Martin, Paul Meier, P. L. Meyer, M. R. Mickey, Jr., Robert Mirsky, O. B. Moan, 
Marjorie Moore, J. E. Morton, Jack Moshman, F. C. Mosteller, L. E. Moses, W. V. Neisius, 
Jerzy Neyman, W. L. Nicholson, H. W. Norton, J. A. Norton, Jr., E. G. Olds, Paul 
Olmstead, Roby Oxtoby, W. R. Pabst, Jr., B. Z. Palmer, R. E. Patton, A. E. Paull, Edward 
Paulson, Stefan Peters, B. E. Phillips, K. C. S. Pillai, A. T. Reid, G. J. Resnikoff, Robert 
Roeloffs, J. H. Roseboom, H. M. Rosenblatt, Murray Rosenblatt, Irving Roshwalb, S. N. 
Roy, L. J. Savage, E. D. Schell, M. A. Schneiderman, E. L. Scott, Esther Seiden, Jack 
Sherman, L. D. Simmons, W. R. Simmons, W. B. Simpson, Rosedith Sitgreaves, J. H. 
Smith, R. T. Smith, III, Milton Sobel, Herbert Solomon, P. N. Somerville, D. E. South, 
Mortimer Spiegelman, M. D. Springer, Arthur Stein, C. M. Stein, G. T. Steinberg, F. F. 
Stephan, H. L. Stubbs, J. V. Sturtevant, J. V. Talacko, J. G. C. Templeton, M. E. Terry, 
D. J. Thompson, R. M. Thrall, L. J. Tick, Leo Törnqvist, C. K. Tsao, J. W. Tukey, S. A. 
Tyler, D. L. Wallace, W. A. Wallis, Samuel Weiss, A. G. Whitney, D. G. Whitney, J. R. 
B. Whittlesey, Frank Wilcoxon, S. S. Wilks, M. J. Willis, Gerald Winston, Jacob Wolfowitz, 
L. A. Woodbury, W. J. Youden, R. K. Zeigler. 


The Program of the meeting was as follows: 


SATURDAY, DECEMBER 27, 1952 
1952 Council Meeting. 8:30 A.M.-9:50 A.M. 
Invited Addresses. 10:00 A.M.-11:50 A.M. 
Chairman: J. Wolfowitz, Cornell University and the University of California at Los 
Angeles. 


Applications of Martingale Theory to Statistics. J. L. Doob, University of Illinois. 
Information Theory. B. McMillan, Bell Telephone Laboratories. 


Combined Survey and Experimental Investigations. 12:00 noon-1:50 P.M. 


Chairman: T. A. Bancroft, Iowa State College. 

General Principles and Some Examples. R. J. Jessen, Iowa State College. 

Uses by the U. S. Census. J. F. Daly, Bureau of the Census. 

Some Aspects of Theory. W. G. Cochran, Johns Hopkins University. 

Discussion: Donovan Thompson, Iowa State College and B. G. Greenberg, University 
of North Carolina. 


Application of Statistics to Astronomy. 2:00 P.M.-5:50 P.M. 


Co-Sponsor: American Statistical Association. 

Chairman: Walter Bartky, University of Chicago. 

The Structure of the Universe. B. Strömgren, University of Chicago. 

Analysis of Counts of the Extra-Galactic Nebulae in Terms of a Fluctuating Density Field. 
D. N. Limber, University of Chicago. 

Theory of Clustering of Galaxies in a Static and in an Expanding Universe. J. Neyman 
and E. L. Scott, University of California, Berkeley. 

Discussion: J. W. Tukey, Princeton University and S. Chandrasekhar, University of 
Chicago. 


SUNDAY, DECEMBER 28, 1952 
Business Meeting. 8:30 A.M.-9:50 A.M. 
Probability and Behavior. 10:00 A.M.-11:50 A.M. 


Chairman: Herman Chernoff, Stanford University. 

Foundations of a Theory of Behavior. E. W. Barankin, University of California, Berkeley, 
and the National Bureau of Standards, Los Angeles. 

Personal Probability. L. J. Savage, University of Chicago. 


CONTRIBUTED PAPERS I. 12:00 noon-1:50 P.M. 


Chairwoman: E. L. Scott, University of California, Berkeley. 
Papers: (1) A Two-sample Multiple Decision Procedure for Ranking Means of Normal 
Populations with Unknown Variances. (Preliminary Report.) R. E. Bechhofer, 
C. W. Dunnett and M. Sobel, Cornell University. 

(2) A Sequential Multiple Decision Procedure for Ranking Means of Normal 
Populations with Known Variances. (Preliminary Report.) R. E. Bechhofer 
and M. Sobel, Cornell University. 

(3) On Quadratic Estimates of Variance Components. F. Graybill, Oklahoma 
A. and M. College and Iowa State College. 

(4) Topics in Analysis of Variance: A. Optimum Properties of Tests for Model 
II, B. Generalizations of Model II. (Preliminary Report.) L. Herbach, 
Brooklyn College and Columbia University. 

(5) Further Examples for which the Likelihood Ratio Test is Poor. L. M. Court, 
Brooklyn College. 

(6) An Empirical Sampling Method. E. L. Cox, Dugway Proving Ground. 

(7) Distribution of Semi-definite and of Indefinite Quadratic Forms. J. Gurland, 
Iowa State College. 

(8) Asymptotic Properties of Ideal Linear Estimators. (By title.) Carl A. Bennett, 
General Electric Company. 

(9) Bias in Estimation by Interval. (By title.) M. C. K. Tweedie, Virginia 
Polytechnic Institute. Introduced by B. Harshbarger, Virginia Polytechnic 
Institute. 

(10) Completeness of the Order Statistics in the Nonparametric Case. (By title.) 
D. A. S. Fraser, University of Toronto. 

(11) Parameter-free and Nonparametric Tolerance Limits: The Exponential Case. 
(By title.) L. A. Goodman, University of Chicago. 

(12) On Estimates Whose Distributions Have a Weak Invariance Property. (By 
title.) L. A. Goodman, University of Chicago. 


Controlled Stochastic Processes. 2:00 P.M.-3:50 P.M. 


Co-sponsor: Econometric Society. 

Chairman: J. Marschak, University of Chicago and Cowles Commission for Research in 
Economics. 

On Controlled Stochastic Processes. J. Kiefer, Cornell University. 

On Some Problems in the Theory of Dynamic Programming. R. Bellman, Rand Corpora- 
tion. 


Discussion: L. Törnqvist, University of Helsinki and Cowles Commission for Research 
in Economics. 


Individual Preference Functions. 4:00 P.M.-5:50 P.M. 


Co-sponsor: Econometric Society. 

Chairman: Abba P. Lerner, Roosevelt College. 

The Intransitivity of Individual Preferences. K. O. May, Carleton College. 

Experimental Testing of a Postulate in the Theory of Choice. A. G. Papandreou, Uni- 
versity of Minnesota. 

Discussion: H. S. Houthakker, Cowles Commission for Research in Economics, Ward 
Edwards, Johns Hopkins University, and Milton Friedman, University of Chicago. 


MONDAY, DECEMBER 29, 1952 
CONTRIBUTED PAPERS II. 8:00 A.M.-9:50 A.M. 


Chairman: J. Kiefer, Cornell University. 
Papers: (13) Admissibility of Certain Statistical Tests. E. L. Lehmann, University of 
California, Berkeley, and C. M. Stein, University of Chicago. 

(14) Some Experimental Designs for Comparative Life-testing. Allan Birnbaum, 
Columbia University. 

(15) Existence of Invariant Minimax Procedures. (By title.) Herman Rubin, 
Stanford University. 

(16) A Simple Sequential Procedure for Testing Statistical Hypotheses. 
(Preliminary Report.) C. K. Tsao, Wayne University. 

(17) Asymptotic Properties of the Robbins-Monro Process. J. L. Hodges, Jr. and 
E. L. Lehmann, University of California, Berkeley. 

(18) Some Notes on the Application of Sequential Methods in the Analysis of 
Variance. N. L. Johnson, University of North Carolina. 

(19) The Covariances of Frequencies from a Multinomial Distribution under a 
Sequential Sampling Rule. M. C. K. Tweedie, Virginia Polytechnic Institute. 
Introduced by B. Harshbarger, Virginia Polytechnic Institute. 

(20) The Group of Transformations Leaving a Set of Normal Distributions 
Invariant. (By title.) E. L. Lehmann, University of California, Berkeley, 
and C. M. Stein, University of Chicago. 

(21) A Nonparametric Two Sample Life Test. (By title.) Benjamin Epstein, 
Wayne University. 

(22) The Large Sample Power of a Test Based on Dichotomization. (By title.) 
C. K. Tsao, Wayne University. 

(23) An Epsilon-complete Class of Nonsequential Decision Procedures in the 
Finite Case. (By title.) Lionel Weiss, University of Virginia. 

(24) Approximate Distribution of Extreme Values of the Range. M. H. Belz, 
University of Melbourne and Princeton University, and R. H. Hooke, 
Princeton University. 


Social Choice Functions. 10:00 A.M.-11:50 A.M. 


Co-sponsor: Econometric Society. 

Chairman: Howard R. Bowen, Williams College. 

A Suggested Approach to Problems of Group Choice. C. Hildreth, University of Chicago 
and Cowles Commission for Research in Economics. 

Some Sociological Observations on Decision Making. E. A. Shils and E. C. Banfield, Uni- 
versity of Chicago. 

Discussion: Abba P. Lerner, Roosevelt College and Alexander Henderson, Carnegie 
Institute of Technology. 


Special Invited Address. 12:00 noon-1:00 P.M. 


Chairman: Herbert Solomon, Columbia University. 


Asymptotic Estimation Theory. Jacob Wolfowitz, Cornell University and the University 
of California at Los Angeles. 


Organized Decision Making. 2:00 P.M.-3:50 P.M. 


Co-sponsor: Econometric Society. 

Chairman: S. B. Littauer, Columbia University. 

Basic Problems in the Economic Theory of Teams. J. Marschak, University of Chicago 
and Cowles Commission for Research in Economics. 

The Study of an Information Processing Center. A. Newell, Rand Corporation. 

Discussion: Daniel Grey, Massachusetts Institute of Technology and N. Rashevsky, 
University of Chicago. 


Multiple Comparison Methods in the Analysis of Variance. 4:00 P.M.-5:50 P.M. 


Chairman: W. G. Madow, University of Illinois. 

Various Methods from a Unified Point of View. J. W. Tukey, Princeton University. 

Discussion: J. Neyman, University of California, Berkeley and J. Cornfield, National 
Institute of Health. 


Contributed Papers (26) and (27), by R. C. Bose and S. N. Roy, were presented at this 
session. 


1953 Council Meeting. 6:00 P.M. 


TUESDAY, DECEMBER 30, 1952 
CONTRIBUTED PAPERS III. 10:00 A.M.-11:50 A.M. 


Chairman: C. I. Bliss, The Connecticut Agricultural Experiment Station. 
Papers: (25) Estimate of the Interval Rate for Actuarial Calculation. (Preliminary Report.) 
J. Berkson, Mayo Clinic. 

(26) Simultaneous Confidence Interval Estimation.* R. C. Bose and S. N. Roy, 
University of North Carolina. 

(27) On a Set of Simultaneous Confidence Interval Statements in Multivariate 
Analysis of Variance.* S. N. Roy and R. C. Bose, University of North 
Carolina. 

(28) On a Relevant Set of Simultaneous Confidence Interval Statements in 
Discriminant Analysis. (By title.) S. N. Roy and R. C. Bose, University of 
North Carolina. 

(29) The Separation of Product Error and Inspection Error in Sampling by 
Attributes. C. C. Craig, University of Michigan. 

(30) The Relation Between Fisher’s Discriminant Function and Wald’s 
Classification Statistic. H. L. Harter, Wright-Patterson Air Force Base. 

(31) On the Distribution of the Ratio of the Ith Observation in an Ordered Sample 
from a Normal Population to an Independent Estimate of the Standard 
Deviation. K. C. S. Pillai and K. V. Ramachandran, University of North 
Carolina. 

(32) Cyclic Solutions of Symmetrical Group Divisible Designs. S. S. Shrikhande, 
University of Kansas. 

(33) Measures of Association in Contingency Tables. (By title.) L. A. Goodman 
and W. H. Kruskal, University of Chicago. 

(34) Estimating Longevities from Sample Composition. (By title.) L. A. Goodman, 
University of Chicago. 


(* Contributed Papers (26) and (27) were presented at the Session on Multiple Com- 
parison Methods in the Analysis of Variance.) 


Invited Addresses. 12:00 noon-1:50 P.M. 


Chairman: L. J. Savage, University of Chicago. 

Stochastic Processes Occurring in the Theory of Queues and their Analysis by the Method 
of the Imbedded Markov Chain. David G. Kendall, University of Oxford and Princeton 
University. 

Saddle-point Approximations in Statistics. H. E. Daniels, University of Cambridge and 
University of Chicago. 

WILLIAM KRUSKAL 
Associate Secretary 


MINUTES OF THE ANNUAL BUSINESS MEETING, 
CHICAGO, DECEMBER 28, 1952 


The meeting was called to order at 8:45 A.M., Sunday, December 28, 1952 
at the Palmer House Hotel, Chicago, Illinois. 

The reports of the President, Secretary-Treasurer and Editor were read and 
accepted. These reports are printed elsewhere in this issue. 

A motion that the Institute extend its thanks to the retiring Editor for the 
continued development of the Annals under his editorship was carried unani- 
mously. 

A motion that the Institute thank the retiring President for his sacrificial 
and devoted service was carried unanimously. 

The tellers reported the election of E. G. Olds as President-Elect and of W. G. 
Cochran, Churchill Eisenhart, Henry Scheffé and J. W. Tukey as members of 
the Council 1953-55. 

The chair was turned over to the new President, M. H. Hansen. 

A discussion of Annual and Regional Meetings took place. There were expres- 
sions of interest in avoiding Sunday meetings and in avoiding meetings during 
the period between Christmas and New Year’s Day. 

The meeting adjourned at 9:25 A.M. 

K. J. ARNOLD 
Secretary 


REPORT OF THE PRESIDENT OF THE INSTITUTE FOR 1952 


The affairs of the Institute this year continued to be in good shape both 
intellectually and financially. Substantial progress has been made in the develop- 
ment of statistical theory. This is evidenced by the scope of the topics presented 
at the regional meetings held at Blacksburg, Virginia, and Eugene, Oregon, and 
the national meetings held at East Lansing, Michigan, and Chicago, Illinois. 
The statistical progress is also evidenced by the volume and quality of the papers 
submitted to and published in the Annals. There has been a fair growth in mem- 
bership and our financial situation is sound. You will hear more details about 
our progress from the Editor of the Annals and the Secretary-Treasurer. 

On July 1 of this year, K. J. Arnold became the new Secretary-Treasurer of 
the Institute, replacing Carl H. Fischer whose term expired. I wish to take this 
occasion on behalf of the Institute to thank Professor Fischer for his devoted 
service while in office, Professor Edwin G. Olds and his Committee for finding 
such an excellent successor, and Michigan State College for making it possible 
for Professor Arnold to serve the Institute in this capacity. 

The expiration of my term as president coincides with the expiration of the 
term of office of the entire Editorial Board of the Annals. I wish to welcome the 
new Editorial Board under the editorship of Erich L. Lehmann and express the 
gratitude of the Institute to T. W. Anderson for his tireless efforts on behalf 
of the Annals and the excellent job he and his editorial board have done. 

As in the past, much of the work of the Institute this year has been carried 
on by committees. A list of the various committees of the Institute was sent you 
early in 1952, and is reproduced with a few changes at the end of this report. 
During the year it was found necessary to create the following additional com- 
mittees. 

(1) IMS Advisory Committee on Statistical Computations: This committee 
was set up at the request of the Office of Naval Research, which needed assistance 
in organizing a program in statistical computations. It consists of Z. W. Birn- 
baum, Chairman, F. C. Mosteller, A. H. Bowker, and Churchill Eisenhart. 

(2) Editorial Committee for the Publication of Wald’s Selected Papers: This 
committee was set up to implement the recommendation of the Wald Memorial 
Committee, and the ensuing action of the Council, that the IMS sponsor 
a volume of selected papers by Wald to be published by the McGraw-Hill 
Company. The committee consists of T. W. Anderson, Chairman, 
Harald Cramér, H. A. Freeman, E. L. Lehmann, Joseph L. Hodges, A. M. 
Mood, and C. M. Stein. A report of the work of this committee will be presented 
by its chairman. 

(3) IMS Committee on Life Membership Rates: This is a committee of one, 
Carl H. Fischer, whom I appointed to look into the desirability of revising the 
IMS Life Membership Rates, which appear to be out of date. 

(4) West Coast Program Committee for the Joint Meeting with AAAS in 
San Francisco in 1954: This committee consists of the established West Coast 
Program Committee with the addition of Elizabeth Scott who has the respon- 
sibility of organizing joint IMS and AAAS sessions for the 1954 Christmas meet- 
ings of the AAAS in San Francisco. 

I wish to express the appreciation of the Institute to all chairmen and members 
of the various committees, and the representatives of the Institute, for carrying 
out the essential tasks of the Institute, many of which are, unfortunately, 
time-consuming and tedious. In particular, I would like to thank the chairmen of the 
program committees and the associate and assistant secretaries for the excellent 
job they did this year in organizing the regional and national meetings. Professor 
David Blackwell, who was program coordinator for all the meetings and who is 
the chairman of the program committee for this meeting, deserves particular 
commendation. I should also like to commend Herbert Solomon and his 
Committee for organizing such an excellent set of invited addresses. 

I would like to appoint the following to the Nominating Committee next 
year: 


Joseph L. Hodges, Jr., Chairman 
Leonard J. Savage 

Howard Levene 

Solomon Kullback 

Joseph F. Daly 


Herbert Solomon 


Committees of the Institute, 1952 

1. The Council and Committees of the Council 


(a) Members of the Council 
Term expires 1952    Term expires 1953    Term expires 1954 
David Blackwell      Harald Cramér        A. H. Bowker 
W. G. Madow          A. M. Mood           T. C. Koopmans 
F. C. Mosteller      Jerzy Neyman         H. E. Robbins 
L. J. Savage         S. S. Wilks          H. G. Romig 
(b) Executive Committee 
M. A. Girshick, President 
Morris Hansen, Pres-Elect 
C. H. Fischer, Sec. and Treas. (until July 1, 1952) 
K. J. Arnold, Sec. and Treas. (after July 1, 1952) 
T. W. Anderson, Editor 
(c) Committee on Fellows 
H. E. Robbins, Chairman (term expires 1952) 
E. L. Lehmann (term expires 1952) 
Jerzy Neyman (term expires 1953) 
Churchill Eisenhart (term expires 1953) 
Gerhard Tintner (term expires 1954) 
S. S. Wilks (term expires 1954) 


2. Committees Related to Program 


(a) Annual Meeting, Chicago          (b) September Meeting, East Lansing
David Blackwell, Chairman            Benjamin Epstein, Chairman
J. F. Daly                           Elizabeth Scott
Michel Loève                         Joseph Steinberg
Paul G. Hoel                         Leonid Hurwicz
W. H. Kruskal                        Ted Harris
W. G. Cochran                        Howard Levene
T. Bancroft                          R. G. D. Steel
Paul Gutt, Asst.-Secretary           Leo A. Goodman
                                     Leo Katz, Asst.-Secretary
(c) Eastern Region           (d) Central Region               (e) Western Region
R. A. Bradley, Chairman      Oscar Kempthorne, Chairman       W. J. Dixon, Chairman
D. F. Votaw                  D. A. Darling                    Z. W. Birnbaum
H. E. Robbins                Leonid Hurwicz                   A. H. Bowker
I. D. J. Bross               W. H. Kruskal                    J. L. Hodges
S. L. Crump                  K. O. May                        P. G. Hoel
W. N. Hurwitz                O. R. Whitney                    A. M. Mood
M. A. Woodbury
(f) Program Coordinator                     (g) Special Invited Papers
David Blackwell                             Herbert Solomon, Chairman
(The Program Coordinator is ex-officio      T. W. Anderson
member of all program committees)           David Blackwell
                                            Benjamin Epstein
                                            R. A. Bradley
                                            Oscar Kempthorne
                                            W. J. Dixon
3. Promotional Committees


(a) Membership        (b) Institutional Members               (c) Subscription
W. D. Baten, Ch.      Academic           Non-Academic         Marion Sandomire, Ch.
C. R. Blyth           I. W. Burr, Ch.    H. G. Romig, Ch.     Leo Goodman
D. B. DeLury          A. H. Bowker       C. C. Hurd           F. C. Leone
T. N. E. Greville     L. A. Knowler      F. E. Grubbs
D. E. South           A. E. Treloar      Jack Sherman


4. Other Committees


(a) Nominating Committee                    (b) Rietz Lecture Committee
Appointed by 1951 President P. S. Dwyer     J. Neyman, Chairman
W. E. Deming, Chairman                      C. C. Craig
C. A. Bennett                               Will Feller
H. W. Norton
Elizabeth Scott
P. R. Rider

(c) Standards for Statisticians in Government Service     (d) Wald Memorial
W. E. Deming, Chairman                                    Howard Levene, Chairman
H. F. Dorn                                                T. W. Anderson
Churchill Eisenhart                                       E. L. Lehmann
B. J. Tepping
Herbert Solomon

(e) Committee for Appointing a New          (f) Committee for Separating Office of
    Secretary-Treasurer                         Secretary from that of Treasurer
E. G. Olds, Chairman                        Carl H. Fischer, Chairman
W. D. Baten                                 Paul S. Dwyer, Ex. Of.
A. H. Bowker                                K. J. Arnold, New Secretary-Treasurer
P. S. Olmstead                              M. A. Girshick, Ex. Of.
D. F. Votaw                                 Morris Hansen, Ex. Of.
M. A. Girshick, Ex. Of.
Morris Hansen, Ex. Of.


5. Representatives of the Institute for 1952 


(a) To the American Association for the Advancement of Science
Harold Hotelling (Term expires 1954)

(b) To the National Research Council, Division of Mathematics
S. S. Wilks (Term expires 1954)

(c) To the Mathematical Policy Committee
Mina Rees (Term expires 1952)

(d) To the Joint Committee for Development of Statistical Applications in Engineering and
Manufacturing
Albert H. Bowker (Term expires 1954)

(e) To the American Academy of Political and Social Science
Benjamin Tepping, Leonard Kent (Terms expire 1952)

(f) To the Committee on the Mathematical Training of Social Scientists
W. G. Madow, T. W. Anderson


December 28, 1952 M. A. GIRSHICK
President 
REPORT OF THE SECRETARY-TREASURER OF 
THE INSTITUTE FOR 1952 


At the beginning of 1952 the Institute had 1227 members. During 1952, 
137 new members were added, 24 former members were reinstated, 30 members 
resigned, 76 were cancelled for non-payment of dues and 2 died, leaving a mem- 
bership of 1280 at the end of 1952. 

Meetings 51, 52, 53 (14th Summer) and 54 (15th Annual) were held during 
1952. For the Blacksburg, Virginia meeting, March 19-21, R. A. Bradley was 
Program Chairman, Boyd Harshbarger was Assistant Secretary and also per- 
formed most of the duties of Associate Secretary. For the Eugene, Oregon 
meeting, June 19-21, W. J. Dixon was Program Chairman and also performed 
the duties of Assistant Secretary and Associate Secretary. For the East Lansing, 
Michigan meeting, September 2-5, Benjamin Epstein was Program Chairman, 
W. H. Kruskal was Associate Secretary and Leo Katz was Assistant Secretary. 
For the Chicago, Illinois meeting David Blackwell was Program Chairman, 
W. H. Kruskal was Associate Secretary and Paul Gutt was Assistant Secretary. 

On behalf of the Institute the Secretary wishes to express appreciation to all 
of the members just named and to the other members of the program committees 
for making each of these meetings a success. 


INSTITUTE OF MATHEMATICAL STATISTICS 
Statement of Condition 
December 31, 1952 
ASSETS
Bank                                                $23,526.18
Dues Receivable                                         144.00
Subscriptions Receivable                              1,243.11
U.S. Government Bonds                                 4,888.00

Total Assets                                         29,801.29


LIABILITIES AND RESERVES

Amount Due Printer for December Issue (Estimate)      2,666.08
Withholding Tax Payable                                 151.20
Miscellaneous Liabilities                                50.00
Reserve for Dues Advanced
Reserve for Subscriptions Advanced
Reserve for Life Members
Reserve for Biometrika Subscriptions

Total Liabilities and Reserves                      $12,359.33
* Surplus (Excess of Assets over Liabilities)        17,441.96


* Listed assets do not include back issues on hand. If these were valued at 67¢ per 
issue the estimated inventory would be $18,700.37. 
Revenues and Expenses Statement 


January 1, 1952 to December 31, 1952 
Revenues
Dues Revenue                                        $12,485.50
Subscriptions Revenue                                 8,333.92
Sale of Back Issues                                   7,751.10
Interest Earned on Bonds                                100.00
Miscellaneous Revenue                                   330.79

                                                    $29,001.31

Expenses
Printing of Annals Current                          $10,325.44
Reprinting of Back Issues                                 8.80
Miscellaneous Printing, Stationery, Postage           1,356.79
Salary Expense                                        3,336.01
Miscellaneous Office Expense                            618.97
Contributions to American Math. Society                 242.20
Binding Expense                                         130.72
Editorial Expense                                       250.00
Meeting Expense                                          22.31
President’s Fund                                         32.86
Travelling Expense                                       86.39
                                                    $16,410.49
Excess of Revenue over Expense                       12,590.82
Excess of Assets over Liabilities Dec. 31, 1951       4,851.14


Excess of Assets over Liabilities Dec. 31, 1952      17,441.96


Most of the excess of revenue over expenses for 1952 can be accounted for by 
the sale of back issues. Contrary to expectations, the sale of back issues increased 
markedly in 1952. Because the expense of reprinting back issues is entered as a 
current expense of the year in which the reprinting is done rather than of the 
year in which the copies are sold and because no reprinting was done during 
1952, the sale of back issues during 1952 appears almost entirely as income with 
no associated expense. If the fund for reprinting back issues which appeared in 
the statement of condition for 1948 and 1949 had been continued it would now 
amount to $15,290.55 and would offset a large share of our present surplus. A 
review of income from the sale of back issues and expense of reprinting during 
the five years, 1948 through 1952, follows: 

            Income       Expense
1948     $2,718.27     $1,968.50
1949      3,314.41      2,910.53
1950      5,465.81      1,313.20
1951      4,628.12      2,386.11
1952      7,751.10          8.80

        $23,877.71     $8,587.16
Most of the remaining excess of revenues over expense is due to an increase 
in the subscription price of Annals and a simultaneous increase in number of 
subscribers. 


K. J. ARNOLD 
Secretary-Treasurer 




REPORT OF THE EDITOR OF THE ANNALS FOR 1952 


The 1952 volume of the Annals was respectfully dedicated to the memory of 
Professor Abraham Wald whose tragic death on December 13, 1950, brought 
to an untimely end his important and prolific contributions to these pages. In 
the March issue Professors J. Wolfowitz, Karl Menger and Gerhard Tintner 
reviewed Wald’s work in statistics, pure mathematics and econometrics and 
paid personal tribute to our late colleague and friend. 

In the 1952 volume, 58 papers were published including 14 notes. The publica- 
tion of abstracts, reports of meetings, news and notices, and other reports brought 
the total number of pages of this volume to 655. During this year the rate of 
submission of papers rose considerably, reflecting the increasing activity and 
number of workers in mathematical statistics. It can be expected that these 
trends will continue. 

The second Special Invited Paper, “The χ² Test of Goodness of Fit,” by Wil- 
liam G. Cochran, was published in this volume. These expository papers serve 
an important purpose in reviewing and interpreting developments in statistical 
theory and methodology and should be published more frequently. 

On behalf of the Editorial Committee, the Editor wishes to acknowledge the 
invaluable refereeing assistance of the following: E. W. Barankin, Robert 
Bechhofer, J. R. Blum, C. R. Blyth, R. P. Boas, Jr., A. H. Bowker, R. A. Bradley, 
D. G. Chapman, Herman Chernoff, K. L. Chung, L. J. Cote, D. A. Darling, 
W. J. Dixon, Churchill Eisenhart, D. A. S. Fraser, L. A. Goodman, John Gur- 
land, A. S. Householder, Eric Immel, S. L. Isaacson, A. T. James, Terry Jeeves, 
G. Kallianpur, E. L. Kaplan, Oscar Kempthorne, D. G. Kendall, Jack Kiefer, 
W. H. Kruskal, Lucien LeCam, Mrs. Emma Lehmer, Roy Leipnik, Roger 
Lyndon, F. J. Massey, Jr., M. R. Mickey, S. Moriguti, K. R. Nair, G. E. Noether, 
C. R. Ohman, Ingram Olkin, M. P. Peisakoff, Herman Rubin, H. Ryser, I. R. 
Savage, E. L. Scott, Milton Sobel, L. J. Snell, C. M. Stein, J. T. Tate, D. F. 
Votaw, Jr., D. L. Wallace, S. S. Wilks, and D. M. G. Wishart. 

Thanks are also due Mr. Jacob Horowitz and Mr. Paul Burke for preparation 
of manuscripts and to Miss Mary Barsamian, Mrs. Clare F. Kozary, and Misses 
Gulnara Natirboff, Gerda Ubel and Maria Vecchione for secretarial and other 
office assistance in connection with the Annals. 

The present Editorial Committee has held responsibility for editing the 
Annals during the past three years and is now turning over the responsibility 
to the incoming Editorial Committee, who have already been considering new 
papers for future Annals. On behalf of the outgoing Committee the Editor 
takes pleasure in extending good wishes to the new Committee and particularly 
to the new Editor, Erich L. Lehmann. 

December 28, 1952 T. W. ANDERSON 
Editor 




PUBLICATIONS RECEIVED 


ANDERSON, R. L. AND BANCROFT, T. A., Statistical Theory in Research, McGraw-Hill Book 
Co., New York, 1952, xix + 399 pp., $7.00. 

CARLUCCI, GIANCARLO, Le Svalutazioni Monetarie Inglesi del 1931 e del 1949, (Istituto di 
Scienze Economiche e Statistiche, Quaderni XV), Milan, 1952, 27 pp. 

GOULDEN, C. H., Methods of Statistical Analysis, 2nd ed., John Wiley and Sons, Inc., New 
York, 1952, $7.50. 

JAMBUNATHAN, M. V., Linear Estimation, The Seshadri Press, Mysore, 1951, vi + 84 pp. 

McKINSEY, J. C. C., Introduction to the Theory of Games, McGraw-Hill Book Co., New York, 
1952, ii + 371 pp., $6.50. 

RAO, C. R., Advanced Statistical Methods in Biometrical Research, John Wiley and Sons, 
Inc., New York, 1952, $7.50. 

Recenseamento Geral do Brasil (1° de Setembro de 1940) Censo Demografico and Censos 
Economicos, Servico Grafico de Instituto Brasileiro de Geografia e Estatistica, Rio de 
Janeiro, 1950. (5 volumes in addition to those listed in March and September 1952.) 

Tablas de Mortalidad de la Población Española, Años 1900 a 1940, Instituto Nacional de Esta- 
dística, Madrid 1952, 137 pp. 

TIPPETT, L. H. C., The Methods of Statistics, 4th ed., John Wiley and Sons, Inc., New York, 
1952, 395 pp., $6.00. 

VINCI, FELICE, La Teoria Dell’ Illusione Finanziaria di A. Puviani nel suo Cinquantesimo 
Anniversario, (Nuove Analisi Economiche e Finanziarie), Milan, 1953, 35 pp. 

WAUGH, ALBERT E., Statistical Tables and Problems, McGraw-Hill Book Co., New York, 
1952, xiv + 242 pp., $3.00. 








ESTADISTICA 


Journal of the Inter American Statistical Institute 


Vol. X, No. 37 December 1952 


Contents 
Utilización de los Datos Censales en el Cálculo de Ingreso y Producto Nacional y en 
   el Análisis Económico ..... MIGUEL FADUL 
Methodology and Summary Results of the 1950 Birth Registration Test in the United 
   States ..... SAM SHAPIRO AND JOSEPH SCHACHTER 
Muestras y Censos ..... ENRIQUE CANSADO 
Document Sensing in Large-Scale Enumerative Surveys 
   ..... RUTH D. BOTHWELL AND DANIEL B. LEVINE 
Identificación Rural Previa como Sustituto de la Cartografía: IV Censo Agropecuario 
   de la República Dominicana ..... MILCIADES D. HERRERA B. 
El Concepto de Actividad Productiva ..... IRVING H. SIEGEL 
Utilización de los Resultados del Censo de las Américas en el Servicio Social 

Resultados Censales Preliminares Obtenibles por Elaboración Adelantada de una 
   Muestra 

Intensive Orientation Program in Labor Force Statistics / Programa de Orientación 
   Intensiva en Estadísticas de Fuerza del Trabajo ..... EDWIN D. GOLDFIELD 


Inter-American Training Center for Economic and Financial Statistics, Santiago, 
Chile, 1953 


Institute Affairs. Statistical News. Publications. 


Published quarterly; annual subscription price $3.00 (U.S.); single copies $1.00 (U.S.) 
Inter American Statistical Institute, c/o Pan American Union, Washington 6, D.C., U. S. A. 


JOURNAL OF THE 
AMERICAN STATISTICAL ASSOCIATION 
March, 1953 


1108 16th St., N.W. Washington 6, D. C. VOL. 48 NO. 261 


Data for Measuring the Effectiveness of Public Income-Maintenance Programs 
   ..... JACOB FISHER 
The Circular Normal Distribution: Theory and Tables 
   ..... E. J. GUMBEL, J. ARTHUR GREENWOOD, AND DAVID DURAND 
Probabilities of Certain Solitaire Card Games ..... ROBERT E. GREENWOOD 
On Probabilities in Bridge ..... DAN F. WAUGH AND FREDERICK V. WAUGH 
Delimitation of Economic Areas: Statistical Conceptions in the Study of the Spatial Structure 
   of an Economic System ..... RUTLEDGE VINING 
Non-Linear Functional Relationship between Two Variables when One Variable is Controlled 
   ..... R. C. GEARY 
Estimating the Ratio between the Proportions of Two Classes when One is a Sub-Class of 
   the Other ..... JACK M. ELKIN 
Labor Productivity in the Soviet Union ..... IRVING H. SIEGEL 
Combination of Neighboring Cells in Contingency Tables ..... C. C. CRAIG 
An Appraisal of the 1950 Census Income Data ..... HERMAN P. MILLER 
Approximating the Mode from Weighted Sample Values ..... HOWARD L. JONES 


Reviews 


THE AMERICAN STATISTICAL ASSOCIATION INVITES 
AS MEMBERS ALL PERSONS INTERESTED IN: 

1. Development of new theory and method 

2. Improvement of basic statistical data 

3. Application of statistical methods to practical problems. 





ECONOMETRICA 


Journal of the Econometric Society 
Contents of Vol. 20, October, 1952, include: 


JAMES TOBIN ..... A Survey of the Theory of Rationing 
MELVIN E. SALVESON ..... On a Quantitative Method in Production Planning 
   and Scheduling 
D. G. CHAMPERNOWNE ..... The Graduation of Income Distributions 
JULIAN L. HOLLEY ..... A Dynamic Model: I. Principles of Model Structure 
MARTIN BECKMANN ..... A Continuous Model of Transportation 
H. WOLD ..... Ordinal Preferences or Cardinal Utility? (With Additional Notes 
   by G. L. S. SHACKLE, L. J. SAVAGE, AND H. WOLD) 
ALAN S. MANNE ..... The Strong Independence Assumption: Gasoline Blends 
   and Probability Mixtures (With Additional Note by A. CHARNES) 
PAUL A. SAMUELSON ..... Probability, Utility, and the Independence Axiom 
E. MALINVAUD ..... Note on von Neumann-Morgenstern’s Strong Independence 
   Axiom 
KENNETH O. MAY ..... A Set of Independent Necessary and Sufficient Conditions 
   for Simple Majority Decision 
I. N. HERSTEIN ..... Comments on Solow’s “Structure of Linear Models” 
RAGNAR FRISCH ..... A Note on Pierre Gorra’s Contribution on Index Numbers 

Book Reviews, Announcements, Data on Members and Subscribers 


Published Quarterly Subscription rates available on request 
The Econometric Society is an international society for the advancement of economic theory in its 
relation to statistics and mathematics 
Subscriptions to Econometrica and inquiries about the work of the Society and the procedure in applying 
for membership should be addressed to Rossen L. Cardwell, Acting Secretary, The Econometric Society, 
The University of Chicago, Chicago 37, Illinois, U. S. A. 





BIOMETRIKA 
A Journal for the Statistical Study of Biological Problems 


Volume 39 Contents Parts 3 and 4, December 1952 


1. Estimation of population parameters from data obtained by means of capture-recapture method, Part II. 
By P. H. LESLIE. 2. Tensor notation and the sampling cumulants of k-statistics. By E. L. KAPLAN. 
3. Estimation in double sampling. By D. R. COX. 4. Sampling from bivariate, non-normal universes. 
By H. HYRENIUS. 5. The truncated Poisson distribution. By P. G. MOORE. 6. Upper 5% and 1% 
points of the ratio s²max/s²min. By H. DAVID. 7. Conditions under which Gram-Charlier and Edge- 
worth curves are positive definite and unimodal. By D. E. BARTON and K. E. DENNIS. 8. On a two- 
sided sequential t-test. By S. RUSHTON. 9. Properties of distribution based on certain simple trans- 
formation of the normal curve. By J. DRAPER. 10. Estimation of the mean and standard deviation of 
a normal population from a censored sample. By A. K. GUPTA. 11. The rank analysis of incomplete 
block designs. By R. A. BRADLEY and M. E. TERRY. 12. The statistical structure of ecological com- 
munities. By J. G. SKELLAM. 13. The growth, survival, wandering and variation of the long-tailed 
field mouse. Part III: Wandering and recapture. By H. P. and H. S. HACKER. 14. Use of scores for 
the analysis of association in contingency tables. By E. J. WILLIAMS. 15. Statistical significance of odd 
bits of information. By M. S. BARTLETT. 16. Tests of fit in time series. By P. WHITTLE. 17. The 
fitting of grouped truncated and grouped censored normal distributions. By P. M. GRUNDY. 18. MIS- 
CELLANEA: Samples with the same number in each stratum. By W. L. STEVENS. Comparison of 
analysis of variance power functions in the parametric and random models. By N. L. JOHNSON. Approxi- 
mation to the probability integral of the distribution of range. By N. L. JOHNSON. Discrimination in 
time series analysis. By A. RUDRA. Statistical control of counting experiments. By H. O. LAN- 
CASTER. Exact grouping corrections to moments and cumulants. By M. KUPPERMAN. 


The subscription price, payable in advance, is 45s. inland, 54s. export (per volume including postage). Cheques 
should be drawn to Biometrika and sent to “The Secretary, Biometrika Office, Department of Statistics, 
University College, London, W.C. 1.” All foreign cheques must be in sterling and drawn on a bank 
having a London agency. 





MATHEMATICAL REVIEWS 


A journal containing reviews of the mathematical liter- 
ature of the world, with full subject and author indices 


Publication of this journal is sponsored by the American Mathe- 
matical Society, Mathematical Association of America, Institute of 